diff --git a/docs/cluster-management/uninstall.md b/docs/cluster-management/uninstall.md
index 5ef66dde54..db2f5733bb 100644
--- a/docs/cluster-management/uninstall.md
+++ b/docs/cluster-management/uninstall.md
@@ -33,10 +33,10 @@ To delete them:
 export AWS_ACCESS_KEY_ID=***
 export AWS_SECRET_ACCESS_KEY=***
 
-# identify the name of your cortex s3 bucket
+# identify the name of your cortex S3 bucket
 aws s3 ls
 
-# delete the s3 bucket
+# delete the S3 bucket
 aws s3 rb --force s3://
 
 # delete the log group (replace with what was configured during installation, default: cortex)
diff --git a/docs/deployments/batch-api/endpoints.md b/docs/deployments/batch-api/endpoints.md
index 0d5550fc69..56c322852b 100644
--- a/docs/deployments/batch-api/endpoints.md
+++ b/docs/deployments/batch-api/endpoints.md
@@ -62,14 +62,14 @@ RESPONSE:
 
 ### S3 file paths
 
-If your input data is a list of files such as images/videos in an s3 directory, you can define `file_path_lister` in your submission request payload. You can use `file_path_lister.s3_paths` to specify a list of files or prefixes, and `file_path_lister.includes` and/or `file_path_lister.excludes` to remove unwanted files. The s3 file paths will be aggregated into batches of size `file_path_lister.batch_size`. To learn more about fine-grained S3 file filtering see [filtering files](#filtering-files).
+If your input data is a list of files such as images/videos in an S3 directory, you can define `file_path_lister` in your submission request payload. You can use `file_path_lister.s3_paths` to specify a list of files or prefixes, and `file_path_lister.includes` and/or `file_path_lister.excludes` to remove unwanted files. The S3 file paths will be aggregated into batches of size `file_path_lister.batch_size`. To learn more about fine-grained S3 file filtering see [filtering files](#filtering-files).
 
 __The total size of a batch must be less than 256 KiB.__
 
 This submission pattern can be useful in the following scenarios:
 
-* you have a list of images/videos in an s3 directory
-* each s3 file represents a single sample or a small number of samples
+* you have a list of images/videos in an S3 directory
+* each S3 file represents a single sample or a small number of samples
 
 If a single S3 file contains a lot of samples/rows, try the next submission strategy.
 
@@ -78,10 +78,10 @@ POST /:
 {
     "workers": , # the number of workers to allocate for this job (required)
     "file_path_lister": {
-        "s3_paths": [], # can be s3 prefixes or complete s3 paths (required)
+        "s3_paths": [], # can be S3 prefixes or complete S3 paths (required)
         "includes": [], # glob patterns (optional)
         "excludes": [], # glob patterns (optional)
-        "batch_size": , # the number of s3 file paths per batch (the predict() function is called once per batch) (required)
+        "batch_size": , # the number of S3 file paths per batch (the predict() function is called once per batch) (required)
     }
     "config": { # custom fields for this specific job (will override values in `config` specified in your api configuration) (optional)
         "string": 
@@ -102,22 +102,22 @@ RESPONSE:
 
 ### Newline delimited JSON files in S3
 
-If your input dataset is a newline delimited json file in an s3 directory (or a list of them), you can define `delimited_files` in your request payload to break up the contents of the file into batches of size `delimited_files.batch_size`.
+If your input dataset is a newline delimited JSON file in an S3 directory (or a list of them), you can define `delimited_files` in your request payload to break up the contents of the file into batches of size `delimited_files.batch_size`.
 
-Upon receiving `delimited_files`, your Batch API will iterate through the `delimited_files.s3_paths` to generate the set of s3 files to process. You can use `delimited_files.includes` and `delimited_files.excludes` to filter out unwanted files. Each S3 file will be parsed as a newline delimited JSON file. Each line in the file should be a JSON object, which will be treated as a single sample. The S3 file will be broken down into batches of size `delimited_files.batch_size` and submitted to your workers. To learn more about fine-grained S3 file filtering see [filtering files](#filtering-files).
+Upon receiving `delimited_files`, your Batch API will iterate through the `delimited_files.s3_paths` to generate the set of S3 files to process. You can use `delimited_files.includes` and `delimited_files.excludes` to filter out unwanted files. Each S3 file will be parsed as a newline delimited JSON file. Each line in the file should be a JSON object, which will be treated as a single sample. The S3 file will be broken down into batches of size `delimited_files.batch_size` and submitted to your workers. To learn more about fine-grained S3 file filtering see [filtering files](#filtering-files).
 
 __The total size of a batch must be less than 256 KiB.__
 
 This submission pattern is useful in the following scenarios:
 
-* one or more s3 files contains a large number of samples and must be broken down into batches
+* one or more S3 files contain a large number of samples and must be broken down into batches
 
 ```yaml
 POST /:
 {
     "workers": , # the number of workers to allocate for this job (required)
     "delimited_files": {
-        "s3_paths": [], # can be s3 prefixes or complete s3 paths (required)
+        "s3_paths": [], # can be S3 prefixes or complete S3 paths (required)
         "includes": [], # glob patterns (optional)
         "excludes": [], # glob patterns (optional)
         "batch_size": , # the number of json objects per batch (the predict() function is called once per batch) (required)
@@ -201,7 +201,7 @@ RESPONSE:
 
 When submitting a job using `delimited_files` or `file_path_lister`, you can use `s3_paths` in conjunction with `includes` and `excludes` to precisely filter files.
 
-The Batch API will iterate through each s3 path in `s3_paths`. If the s3 path is a prefix, it iterates through each file in that prefix. For each file, if `includes` is non-empty, it will discard the s3 path if the s3 file doesn't match any of the glob patterns provided in `includes`. After passing the `includes` filter (if specified), if the `excludes` is non-empty, it will discard the s3 path if the s3 files matches any of the glob patterns provided in `excludes`.
+The Batch API will iterate through each S3 path in `s3_paths`. If the S3 path is a prefix, it will iterate through each file in that prefix. For each file, if `includes` is non-empty, it will discard the S3 path if the S3 file doesn't match any of the glob patterns provided in `includes`. After passing the `includes` filter (if specified), if `excludes` is non-empty, it will discard the S3 path if the S3 file matches any of the glob patterns provided in `excludes`.
 
 If you aren't sure which files will be processed in your request, specify the `dryRun=true` query parameter in the job submission request to see the target list.
 
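For illustration, a `file_path_lister` submission that exercises the glob filtering and `dryRun` behavior described above could look like the following sketch (the endpoint, bucket, prefix, and glob patterns are placeholder values, not part of this change):

```bash
# Hypothetical values -- substitute your own Batch API endpoint and S3 prefix.
$ curl "https://<batch_api_endpoint>?dryRun=true" \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{
          "workers": 1,
          "file_path_lister": {
            "s3_paths": ["s3://my-bucket/images/"],
            "includes": ["**.jpg"],
            "excludes": ["**/thumbnails/**"],
            "batch_size": 10
          }
        }'
```

With `dryRun=true`, the request only returns the list of files that would be batched, which makes it easy to check that the `includes`/`excludes` patterns select the intended files before allocating workers.
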
diff --git a/docs/deployments/batch-api/predictors.md b/docs/deployments/batch-api/predictors.md
index b245980caf..0f51d4b099 100644
--- a/docs/deployments/batch-api/predictors.md
+++ b/docs/deployments/batch-api/predictors.md
@@ -81,7 +81,7 @@ For proper separation of concerns, it is recommended to use the constructor's `c
 
 ### Examples
 
-You can find an example of a BatchAPI using a PythonPredictor in [examples/batch/image-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/batch/image-classifier)
+You can find an example of a BatchAPI using a PythonPredictor in [examples/batch/image-classifier](https://github.com/cortexlabs/cortex/tree/master/examples/batch/image-classifier).
 
 ### Pre-installed packages
 
@@ -198,7 +198,7 @@ For proper separation of concerns, it is recommended to use the constructor's `c
 
 ### Examples
 
-You can find an example of a BatchAPI using a TensorFlowPredictor in [examples/batch/tensorflow](https://github.com/cortexlabs/cortex/tree/master/examples/batch/tensorflow)
+You can find an example of a BatchAPI using a TensorFlowPredictor in [examples/batch/tensorflow](https://github.com/cortexlabs/cortex/tree/master/examples/batch/tensorflow).
 
 ### Pre-installed packages
 
@@ -267,7 +267,7 @@ For proper separation of concerns, it is recommended to use the constructor's `c
 
 ### Examples
 
-You can find an example of a BatchAPI using an ONNXPredictor in [examples/batch/onnx](https://github.com/cortexlabs/cortex/tree/master/examples/batch/onnx)
+You can find an example of a BatchAPI using an ONNXPredictor in [examples/batch/onnx](https://github.com/cortexlabs/cortex/tree/master/examples/batch/onnx).
 
 ### Pre-installed packages
 
diff --git a/examples/batch/image-classifier/README.md b/examples/batch/image-classifier/README.md
index 7ae684f58f..e723a5360a 100644
--- a/examples/batch/image-classifier/README.md
+++ b/examples/batch/image-classifier/README.md
@@ -415,7 +415,7 @@ spinning up workers...
 
 The status of your job, which you can get from `cortex get `, should change from `running` to `succeeded` once the job has completed. If it changes to a different status, you may be able to find the stacktrace using `cortex logs `. If your job has completed successfully, you can view the results of the image classification in the S3 directory you specified in the job submission.
 
-Using AWS CLI:
+Using the AWS CLI:
 
 ```bash
 $ aws s3 ls $CORTEX_DEST_S3_DIR//
@@ -524,7 +524,7 @@ spinning up workers...
 
 The status of your job, which you can get from `cortex get `, should change from `running` to `succeeded` once the job has completed. If it changes to a different status, you may be able to find the stacktrace using `cortex logs `. If your job has completed successfully, you can view the results of the image classification in the S3 directory you specified in the job submission.
 
-Using AWS CLI:
+Using the AWS CLI:
 
 ```bash
 $ aws s3 ls $CORTEX_DEST_S3_DIR//
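As a rough sketch of the result-inspection step in the README above, listing and downloading a job's output with the AWS CLI could look like this (the job ID and destination bucket/prefix are placeholders, assuming `$CORTEX_DEST_S3_DIR` was exported during the submission step):

```bash
# Placeholder job ID -- use the ID returned by your own job submission.
$ export CORTEX_DEST_S3_DIR=s3://my-bucket/image-classifier            # assumed destination bucket/prefix
$ aws s3 ls "$CORTEX_DEST_S3_DIR/69b183ed6bdf3497/"                    # list the files the job wrote
$ aws s3 cp "$CORTEX_DEST_S3_DIR/69b183ed6bdf3497/" ./results --recursive   # download them locally
```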