Merged
6 changes: 4 additions & 2 deletions README.md
@@ -1,7 +1,9 @@
# A notebook whirlwind tour of Common Crawl's datasets on AWS

This repo provides an example Amazon Sagemaker notebook showcasing the
Common Crawl Dataset on AWS.
This repo provides an example notebook showcasing the
Common Crawl Dataset on AWS. It has been tested on Amazon SageMaker
and Jupyter Notebook, and is designed to run in any Python notebook
environment that supports standard Jupyter .ipynb execution.

All Common Crawl datasets are hosted in AWS S3 buckets and are
publicly available through `s3` and `https` protocols. To read more