Introduction

This project analyzes a publicly available Toronto Parking Tickets dataset using Apache Spark and Scala. It builds a JAR file that can be submitted to Apache Spark in standalone or cluster mode, for example on Google Dataproc or Amazon EMR. The application relies heavily on Resilient Distributed Datasets (RDDs), Map/Reduce, and ETL.
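
As an illustration of the RDD and Map/Reduce concepts mentioned above, here is a minimal sketch of a Spark job that totals fine amounts per infraction code. It is not taken from this project's source: the object name `TicketCountSketch`, the helper `parseLine`, and the column positions (infraction code in column 2, fine amount in column 4) are assumptions and should be checked against the actual CSV header. The comma split is naive and does not handle quoted fields.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

// Hypothetical example object; not part of the original project.
object TicketCountSketch {
  // Assumed column positions; adjust to match the real dataset header.
  val InfractionCodeCol = 2
  val FineAmountCol = 4

  /** Extract step: returns (infractionCode, fine), or None for malformed rows.
    * The header row is dropped too, because its fine column is not numeric. */
  def parseLine(line: String): Option[(String, Long)] = {
    val cols = line.split(",", -1) // naive split; ignores quoted commas
    if (cols.length <= FineAmountCol) None
    else Try((cols(InfractionCodeCol).trim, cols(FineAmountCol).trim.toLong)).toOption
  }

  def main(args: Array[String]): Unit = {
    // The master URL is left unset so that spark-submit can supply it.
    val conf = new SparkConf().setAppName("TicketCountSketch")
    val sc = new SparkContext(conf)
    try {
      val totals = sc.textFile(args(0))            // load raw CSV lines as an RDD
        .flatMap(line => parseLine(line).toList)   // map: parse, skipping bad rows
        .reduceByKey(_ + _)                        // reduce: sum fines per code
      totals.saveAsTextFile(args(1))               // load: write results out
    } finally {
      sc.stop()
    }
  }
}
```

Once packaged into a JAR, a job like this could be launched with something along the lines of `spark-submit --class TicketCountSketch --master local[*] <jar> <input> <output>`, where the JAR name and paths are placeholders.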

Dataset Source

The entire dataset is available at this link

iamjatindersingh/tdot-parking-data-analysis
