This project uses Python and PySpark to demonstrate how big data processing can be leveraged to analyze car crashes in the UK. The attached Jupyter Notebook can be used in conjunction with Databricks to process the data across a real cluster.
- data - contains the four files used in the analysis:
  a. Acc.csv - 2017 accident data reported by the UK police force.
  b. Cas.csv - 2017 casualty data reported by the UK police force.
  c. Veh.csv - 2017 vehicle data reported by the UK police force.
  d. dictionary.xls - data dictionary used to define the coded categorical values within the datasets.
- images - contains visualizations:
  a. uk_accidents.png - heatmap showing accidents in the UK by accident severity.
- car_crash.ipynb - Jupyter Notebook containing all analysis performed on the datasets, along with visualizations.
Python is used in conjunction with PySpark for all analysis performed.
The following commands will import all necessary packages:
```python
import pyspark, os, zipfile
import pandas as pd
import urllib.request
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
from pyspark.sql import SQLContext
from pyspark import SparkContext
from pyspark.sql.types import IntegerType
```
PySpark takes special configuration to install and run within Jupyter Notebook:

- If you're using Windows, Michael Galarnyk has an excellent tutorial on installing PySpark for Windows.
- If you are installing on Linux or macOS, Charles Bochet's article will get you started.
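Once Spark itself is installed, one common way to wire it into Jupyter (an assumed setup, not the only one the tutorials above describe) is to point the `pyspark` launcher at the notebook server via environment variables:

```shell
# Assumed shell configuration: makes running `pyspark` open a Jupyter notebook
# with a SparkContext ready to use (add to ~/.bashrc or ~/.zshrc to persist).
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
```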
- Would like to thank the UK government for posting the data on their website.
- Would like to thank the Stack Overflow user whose function I stole; because of you lot I get to stand on the shoulders of giants.
MIT License Copyright (c) 2019 Ian Jeffries