Skip to content

Conversation

@matthchr
Copy link
Contributor

@matthchr matthchr commented Jan 14, 2019

  • Refactor TextSearch sample to more accurately reflect current best practices.
  • Use Microsoft.Azure.Batch 10.0.
  • Use Microsoft.Azure.Batch.Conventions.Files 3.3.0.
  • Use Microsoft.Azure.Batch.FileStaging 8.2.0.
  • Use Microsoft.Azure.Management.Batch 7.0.
  • Update Microsoft.IdentityModel.Clients.ActiveDirectory and Microsoft.Azure.Management.ResourceManager to the latest versions.
  • Clean up contributing.md.
  • Some minor code-formatting improvements.

@matthchr matthchr requested review from darylmsft and xingwu1 January 14, 2019 23:26
@matthchr
Copy link
Contributor Author

@bgklein - for some reason I can't request a review from you -- maybe because you're not an admin of the repo.

Copy link
Contributor

@bgklein bgklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple minor things

/// <param name="dumpStandardOutOnTaskSuccess">True to log the standard output file of the task even if it succeeded. False to not log anything if the task succeeded.</param>
/// <returns>The string containing the standard out of the file, or null if stdout could not be gathered.</returns>
public static async Task<string> CheckForTaskSuccessAsync(CloudTask boundTask, bool dumpStandardOutOnTaskSuccess)
public static async Task CheckForTaskSuccessAsync(CloudTask boundTask, bool dumpStandardOutOnTaskSuccess)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this return something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to, it can just raise if the task failed.

I need to fix the comment though.

private readonly string fileName;

/// <summary>
/// Constructs a mapper task object with the specified blob SAS.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct method header

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

### About this sample
This map-reduce style sample uses Azure Batch to perform parallel text processing on an input file by splitting it up into multiple sub-files and performing regular expression matching on each sub-file. The results are then rolled-up into a final report by a reduction phase where it is uploaded to Azure Storage.
This map-reduce style sample uses Azure Batch to perform parallel text processing on an input file by splitting it up into multiple sub-files and performing regular expression matching on each sub-file.
This sample makes use of a variety of Azure Batch features, such as auto-pool functionality, task `ResourceFiles` for moving data into VMs, task `OutputFiles` for moving data out of VMs, and task dependencies for specifying tasks relationships to one another.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two new lines between new lines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got rid of the line break actually as it's not really needed. The final text reads fine as one paragraph I think.

/// </summary>
/// <param name="cloudStorageAccount">The cloud storage account.</param>
/// <param name="containerName">The container name to construct a SAS for.</param>
/// <returns>The container URL with the SAS.</returns>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix method header

return containerSasUrl + "/" + blobName;
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how I managed to do that, but good catch - fixed.

@matthchr matthchr force-pushed the feature/batch-sdk-10-update branch from f0aa268 to dfcf2d3 Compare January 16, 2019 15:58
@matthchr
Copy link
Contributor Author

Thanks for reviewing @bgklein - I've updated according to your comments

@bgklein
Copy link
Contributor

bgklein commented Jan 16, 2019

CI failure?

/// Upload a file as a blob.
/// </summary>
/// <param name="cloudStorageAccount">The cloud storage account to upload the file to.</param>
/// <param name="containerName">The name of the container to upload the blob to.</param>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

container != containerName

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops thought they were the same. Anyways missed this method header in my initial review

### [TextSearch](./TextSearch)
This map-reduce style sample uses Azure Batch to perform parallel text processing on an input file by splitting it up into multiple sub-files and performing regular expression matching on each sub-file. A Job Manager task is used to orchestrate the mapper and reducer tasks with the results rolled-up into a final report. See the readme in the source directory for more information.
This map-reduce style sample uses Azure Batch to perform parallel text processing on an input file by splitting it up into multiple sub-files and performing regular expression matching on each sub-file.
This sample makes use of a variety of Azure Batch features, such as auto-pool functionality, task `ResourceFiles` for moving data into VMs, task `OutputFiles` for moving data out of VMs, and task dependencies for specifying tasks relationships to one another. See the readme in the source directory for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is readme technically supposed to be lowercased or upper? I honestly don't know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like all upper? README?

I'm pretty sure we're not consistent, but I'll go fix casing to be the same as the actual file on disk (whatever case that is).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it's lowercase on disk there, so at least we're matching... I think I'll leave this as is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, just kinda curious. As long as we are consistent its fine


#### JobManager
The job manager task submits mapper and reducer tasks and also monitors the status of those tasks.
Uploads the files required for the text processing to Azure Storage and submits a job to the Azure Batch service that utilizes the autopool functionality.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

autopool was previously auto-pool above. Make up your mind

Copy link
Contributor

@bgklein bgklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple more minor things but otherwise looks good to me

@matthchr matthchr force-pushed the feature/batch-sdk-10-update branch from dfcf2d3 to 672616b Compare January 17, 2019 02:21
  - Use OutputFiles where it makes sense.
  - Use TaskDependencies in TextSearch, as it makes more sense than a JobManager for that case.
  - Microsoft.Azure.Batch.FileStaging and
    Microsoft.Azure.Batch.Conventions.Files.
@matthchr matthchr force-pushed the feature/batch-sdk-10-update branch from 672616b to f2b4d49 Compare January 17, 2019 18:04
@matthchr matthchr merged commit 49b490e into Azure-Samples:master Jan 17, 2019
@matthchr matthchr deleted the feature/batch-sdk-10-update branch January 17, 2019 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants