Ehijator/BritishPoliceData

British Police Analytics

This project ingests data from the official British Police API, runs on a schedule that constantly checks the API for updates, stores the data in a data warehouse, and pulls analytics from that warehouse.

ETL Overview

The process begins by pulling data with the requests library. The script that fetches the data defines a class called Access(), which is made up of three methods:

  • Access().historic_Pipeline() brings in all the data the API currently makes available to the public; it was used for the initial ingestion.
  • Access().monthly_Pipeline() handles the monthly data pipeline and will be called daily to check for updates.
  • Access().load_To_Db() loads the data into the staging tables in SQL Server.
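
The daily check behind the monthly pipeline can be sketched roughly as follows. This is a minimal illustration, not the repository's actual Access() code: the function names (fetch_json, latest_available_month, months_to_load) are hypothetical, and it assumes the API's crime-last-updated endpoint returns a JSON object with a "date" field such as "2024-01-01". The requests import is deferred so that the date helper can be used without a network call.

```python
from typing import Any, Callable, List, Optional

API_ROOT = "https://data.police.uk/api"


def fetch_json(url: str, params: Optional[dict] = None) -> Any:
    """Default fetcher: GET a URL and decode the JSON body."""
    import requests  # deferred import; only needed for real API calls

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def latest_available_month(fetch: Callable[..., Any] = fetch_json) -> str:
    """Return the newest published month as 'YYYY-MM'.

    Assumes the endpoint answers with {"date": "YYYY-MM-DD"}.
    """
    payload = fetch(f"{API_ROOT}/crime-last-updated")
    return payload["date"][:7]


def months_to_load(last_loaded: str, latest: str) -> List[str]:
    """List every 'YYYY-MM' after last_loaded, up to and including latest."""
    year, month = (int(part) for part in last_loaded.split("-"))
    end = tuple(int(part) for part in latest.split("-"))
    months = []
    while True:
        month += 1
        if month > 12:
            year, month = year + 1, 1
        if (year, month) > end:
            break
        months.append(f"{year:04d}-{month:02d}")
    return months
```

A daily job would call latest_available_month(), compare the result with the last month already in the warehouse, and fetch only the months returned by months_to_load(). On most days that list is empty and nothing is downloaded.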

After retrieval, the data is stored in a staging table called Crime Staging, where it is transformed to meet the specifications of a star schema, a database organisational structure optimized for business intelligence and data warehouses.
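
As a rough sketch of the staging step, the snippet below flattens nested crime records into rows and inserts them into a staging table. It is an illustration under several assumptions: sqlite3 stands in for SQL Server so the example runs anywhere, the table and column names (crime_staging, street_name, and so on) are hypothetical, and the record layout (a nested "location" with a "street", plus an optional "outcome_status") is assumed to match what the API returns for street-level crimes.

```python
import sqlite3
from typing import Any, Dict, Iterable, Tuple


def flatten_crime(record: Dict[str, Any]) -> Tuple:
    """Turn one nested crime record into a flat staging row."""
    location = record.get("location") or {}
    street = location.get("street") or {}
    outcome = record.get("outcome_status") or {}  # may be null in the source
    return (
        record.get("id"),
        record.get("month"),
        record.get("category"),
        street.get("name"),
        location.get("latitude"),
        location.get("longitude"),
        outcome.get("category"),
    )


def load_to_staging(conn: sqlite3.Connection, records: Iterable[Dict[str, Any]]) -> int:
    """Insert flattened records into the (hypothetical) staging table."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS crime_staging (
            crime_id INTEGER, month TEXT, category TEXT, street_name TEXT,
            latitude TEXT, longitude TEXT, outcome TEXT
        )
        """
    )
    rows = [flatten_crime(record) for record in records]
    conn.executemany("INSERT INTO crime_staging VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Keeping the staging table flat and untyped (coordinates stay as text here) defers all cleaning to the transformation step, so a malformed record does not block ingestion.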

To meet these specifications, SSIS will be used for batch processing of the data. The MSSQL scripts that model the data will be stored in an SSIS package and executed sequentially, which keeps them reusable. The dimension tables will be modelled first, followed by the fact table. The package will be triggered by the arrival of new data from the Access() class.
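
The load order described above (dimensions first, then the fact table) can be illustrated with a small sketch. This is not the project's SSIS package: it uses Python's sqlite3 module as a stand-in for SQL Server, and every table and column name (dim_category, dim_month, fact_crime, and a three-column crime_staging) is hypothetical. The point is only that the scripts run in a fixed sequence and can be re-run without duplicating rows.

```python
import sqlite3
from typing import List

# Ordered steps: each dimension is built before the fact table that references it.
STEPS = [
    ("dim_category", """
        CREATE TABLE IF NOT EXISTS dim_category (
            category_key INTEGER PRIMARY KEY AUTOINCREMENT,
            category TEXT UNIQUE
        );
        INSERT OR IGNORE INTO dim_category (category)
            SELECT DISTINCT category FROM crime_staging;
    """),
    ("dim_month", """
        CREATE TABLE IF NOT EXISTS dim_month (
            month_key INTEGER PRIMARY KEY AUTOINCREMENT,
            month TEXT UNIQUE
        );
        INSERT OR IGNORE INTO dim_month (month)
            SELECT DISTINCT month FROM crime_staging;
    """),
    ("fact_crime", """
        CREATE TABLE IF NOT EXISTS fact_crime (
            crime_id INTEGER PRIMARY KEY,
            category_key INTEGER REFERENCES dim_category (category_key),
            month_key INTEGER REFERENCES dim_month (month_key)
        );
        INSERT OR IGNORE INTO fact_crime (crime_id, category_key, month_key)
            SELECT s.crime_id, c.category_key, m.month_key
            FROM crime_staging AS s
            JOIN dim_category AS c ON c.category = s.category
            JOIN dim_month AS m ON m.month = s.month;
    """),
]


def run_steps(conn: sqlite3.Connection) -> List[str]:
    """Execute each modelling script in order and report what ran."""
    executed = []
    for name, script in STEPS:
        conn.executescript(script)
        executed.append(name)
    return executed
```

Because each script uses CREATE TABLE IF NOT EXISTS and INSERT OR IGNORE against a unique key, running the sequence again after new staging data arrives adds only the new rows. The real T-SQL scripts would need an equivalent guard, for example a MERGE or a NOT EXISTS filter.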