Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

Merge pull request #3962 from Azure/FromPublicRepo

From public repo
  • Loading branch information...
commit 450406c898ee5d0de94fae8b0f6c8168382c7ffc 2 parents 90061fc + 4f61232
@v-aljenk v-aljenk authored
View
3  articles/hdinsight-use-oozie.md
@@ -90,8 +90,7 @@ Oozie workflows definitions are written in hPDL (a XML Process Definition Langua
The Hive action in the workflow calls a HiveQL script file. This script file contains three HiveQL statements:
1. **The DROP TABLE statement** deletes the log4j Hive table in case it exists.
-2. **The CREATE TABLE statement** creates a log4j Hive external table pointing to the
-3. location of the log4j log file. The field delimiter is ",". The default line delimiter is "\n". Hive external table is used to avoid the data file being removed from the original location, in case you want to run the Oozie workflow multiple times.
+2. **The CREATE TABLE statement** creates a log4j Hive external table pointing to the location of the log4j log file. The field delimiter is ",". The default line delimiter is "\n". Hive external table is used to avoid the data file being removed from the original location, in case you want to run the Oozie workflow multiple times.
3. **The INSERT OVERWRITE statement** counts the occurrences of each log level type from the log4j Hive table, and saves the output to an Azure Storage - Blob (WASB) location.
There is a known Hive path issue. You will run into this problem when submitting an Oozie job. The instructions for fixing the issue can be found at [TechNet Wiki][technetwiki-hive-error].
View
4 articles/machine-learning-faq.md
@@ -50,7 +50,7 @@ The **Reader** module can read data from Azure Table, Azure Blob, SQL Database (
8.**How large can my data set be?**
-Azure ML Studio supports training dataset of up to 10GB. There is no limit on the dataset size for Web Services. Sampling larger datasets via Hive or SQL queries before ingestion is also supported. If you are working with data larger than10GB, you can you can create multiple datasets and use the ‘Partition and Sample’, ‘Split’ or ‘Join’ modules to recombine these datasets in ML studio to create training sets for building predictive models. Visit module help in ML Studio to learn more about these modules.
+Azure ML Studio supports training dataset of up to 10GB. There is no limit on the dataset size for Web Services. Sampling larger datasets via Hive or SQL queries before ingestion is also supported. If you are working with data larger than 10GB, you can create multiple datasets and use the ‘Partition and Sample’, ‘Split’ or ‘Join’ modules to recombine these datasets in ML studio to create training sets for building predictive models. Visit module help in ML Studio to learn more about these modules.
For datasets larger than a couple GB, the recommended approach is to upload data to Azure storage or SQL Database (Azure) or use HDInsight, rather than directly uploading from local file.
@@ -109,4 +109,4 @@ We will be adding new material to Machine Learning Center on an ongoing basis. Y
Azure ML is supported as a part of the Azure support offering. To get technical support on Azure ML, select ‘Machine Learning’ as service and you will be provided a category of topics to file your support ticket. To learn more about Azure Support offering visit <http://azure.microsoft.com/en-us/support/options/>
-Azure Machine Learning also has a community forum on MSDN, where you can ask Azure ML related questions. The forum is monitored by the Azure ML team. Visit [Azure Forum](http://social.msdn.microsoft.com/Forums/windowsazure/en-US/home?forum=MachineLearning).
+Azure Machine Learning also has a community forum on MSDN, where you can ask Azure ML related questions. The forum is monitored by the Azure ML team. Visit [Azure Forum](http://social.msdn.microsoft.com/Forums/windowsazure/en-US/home?forum=MachineLearning).
View
2  articles/machine-learning-sample-prediction-of-number-of-bike-rentals.md
@@ -17,7 +17,7 @@ We used 2011 data as a training set and 2012 data as a test set. We compared 4 s
1. number of bikes that were rented in each of the previous 12 days at the same hour
1. number of bikes that were rented in each of the previous 12 weeks at the same hour and the same day
-Features B capture very recent demand for the bikes. Features C capture demand for bikes at particular hour. Features D capture demand for bikes at particular and particular day of the week.
+Features B capture very recent demand for the bikes. Features C capture demand for bikes at particular hour. Features D capture demand for bikes at particular hour and particular day of the week.
Since the label (number of rentals) is real-valued we have regression setting. Also, since the number of features is relatively small (less than 100) and they are not sparse, the decision boundary is probably nonlinear. Based on this, we decided to use boosted decision tree regression algorithm.
View
4 articles/storage-introduction.md
@@ -20,7 +20,7 @@ Azure Storage is elastic, so you can design applications for a large global audi
Azure Storage uses an auto-partitioning system that automatically load-balances your data based on traffic. This means that as the demands on your application grow, Azure Storage automatically allocates the appropriate resources to meet them.
-Azure Storage is accessible from anywhere in the world, from any type of application, whether it’s running in the cloud, on the desktop, on an on-premise server, or on a mobile or tablet device. You can use Azure Storage in mobile scenarios where the application stores a subset of data on the device and synchronizes it with a full set of data stored in the cloud.
+Azure Storage is accessible from anywhere in the world, from any type of application, whether it’s running in the cloud, on the desktop, on an on-premises server, or on a mobile or tablet device. You can use Azure Storage in mobile scenarios where the application stores a subset of data on the device and synchronizes it with a full set of data stored in the cloud.
Azure Storage supports clients using a diverse set of operating systems (including Windows and Linux) and a variety of programming languages (including .NET, Java, and C++) for convenient development. Azure Storage also exposes data resources via simple REST APIs, which are available to any client capable of sending and receiving data via HTTP/HTTPS.
@@ -76,7 +76,7 @@ For today's Internet-based applications, NoSQL databases like Table storage offe
## Queue Storage ##
-In designing applications for scale, application components are often decoupled, so that they can scale independently. Queue storage provides a reliable messaging solution for asynchronous communication between application components, whether they are running in the cloud, on the desktop, on an on-premise server, or on a mobile device. Queue storage also supports managing asynchronous tasks and building process workflows.
+In designing applications for scale, application components are often decoupled, so that they can scale independently. Queue storage provides a reliable messaging solution for asynchronous communication between application components, whether they are running in the cloud, on the desktop, on an on-premises server, or on a mobile device. Queue storage also supports managing asynchronous tasks and building process workflows.
A storage account can contain any number of queues. A queue can contain any number of messages, up to the 200 TB capacity limit of the storage account. Individual messages may be up to 64 KB in size.
Please sign in to comment.
Something went wrong with that request. Please try again.