Pig

A Pig Latin script that I used to process 0.5 TB of data (the billion triple dataset) on Amazon Elastic MapReduce. It was written for one of the optional assignments in the Introduction to Data Science course on Coursera.

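What this script does: reads RDF data from the billion triple dataset, counts the triples for each subject, groups the subjects by that count, and outputs a histogram of the distribution of counts across the subjects (for each count, the number of subjects that have it).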
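The same two-stage aggregation can be sketched in memory on a handful of triples. This is a minimal Python illustration, not the Pig script from this repository: the function name `subject_count_histogram`, the `example.org` sample triples, and the simplified parsing (the subject is taken to be the first whitespace-delimited token of each line) are all assumptions made for the example.

```python
from collections import Counter


def subject_count_histogram(lines):
    """Return {n: number of subjects that appear in exactly n triples}."""
    per_subject = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Simplified parse (assumption): the subject is the first
        # whitespace-delimited token of the line.
        subject = line.split(None, 1)[0]
        per_subject[subject] += 1
    # Second grouping: how many subjects share each triple count.
    return dict(Counter(per_subject.values()))


# Hypothetical sample data, made up for illustration.
sample = [
    "<http://example.org/a> <http://example.org/p> <http://example.org/x> .",
    "<http://example.org/a> <http://example.org/p> <http://example.org/y> .",
    "<http://example.org/b> <http://example.org/p> <http://example.org/x> .",
    '<http://example.org/c> <http://example.org/q> "literal" .',
]

histogram = subject_count_histogram(sample)
for n in sorted(histogram):
    print(n, histogram[n])
```

Here subject `a` has two triples while `b` and `c` have one each, so the histogram maps a count of 1 to two subjects and a count of 2 to one subject. On the full dataset, both grouping steps would run as distributed jobs rather than in a single process.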

Files: README.md, histogram.pig

gawecoti/Pig
