A simple time series regression to understand the different steps of modeling

FabriceMesidor/TimeSeries_accident_UK

An analysis of daily accidents in the UK from 2014 to 2017 using time series

 A simple time series regression to understand the different steps of modeling

Link to the article where I published this analysis!

Introduction

Linear regression is one of the most common models used by data scientists: an outcome, or target variable, is explained by a set of features. Sometimes, though, the same variable is collected over time, and what we have is a sequence of measurements of that variable taken at regular intervals. Welcome to time series. One difference from standard linear regression is that the observations are not necessarily independent and not necessarily identically distributed. Working with time series can be frustrating because you have to model how the current value relates to its own past values (lags) or to the errors of previous predictions. The ordering also matters: changing the order changes the meaning of the data. Because of this complexity, data scientists sometimes get lost in the process of time series analysis. In this blog, I am going to share a full time series analysis guided by one of the well-known data science frameworks: OSEMN (Obtain, Scrub, Explore, Model, iNterpret), also written OSEMiN.
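To make the idea of a value depending on its own lags more concrete, here is a minimal sketch of the sample autocorrelation at a given lag, written with the standard library only. The function name `autocorrelation` and the toy counts are made up for illustration; they are not taken from the repository or from the real accident data.

```python
from statistics import mean


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = mean(series)
    # Denominator: total variation of the series around its mean.
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Numerator: co-variation between each value and the value `lag` steps earlier.
    num = sum((series[t] - mu) * (series[t - lag] - mu) for t in range(lag, n))
    return num / denom


# Made-up daily counts with a repeating weekly shape (NOT the real data).
toy_counts = [300, 420, 430, 440, 450, 480, 350] * 8

lag1 = autocorrelation(toy_counts, lag=1)  # today vs. yesterday
lag7 = autocorrelation(toy_counts, lag=7)  # today vs. same weekday last week
print(f"lag 1: {lag1:.3f}, lag 7: {lag7:.3f}")
```

On a series with a weekly shape like this toy one, the lag-7 value comes out higher than the lag-1 value, which is the kind of dependence that an ordinary regression on independent observations would miss.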

Context and Data used

The visual above shows the methodology used in my study, from gathering the data to drawing conclusions. The data used for this analysis contain 1,461 daily records, each with a date and the number of accidents recorded in the UK on that day, from January 1, 2014 to December 31, 2017. I used a dataset from Kaggle for this exercise: I downloaded a CSV file and used the pandas function 'pd.read_csv' to load it into a DataFrame. No other independent variables were considered, as I am focused on the time series itself. The main purpose of this study is to explain the different steps of a full data science project. A second objective is to find out whether the number of accidents on a given day depends on the number of accidents on previous days. The three questions that the study seeks to answer are: What is the relation between the number of accidents on the current day and on the day before? Is there any pattern that can help predict (or prevent) the number of accidents in the UK on any given day? Is the month of the year or the day of the week related to the number of accidents?
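As a rough sketch of how the loading step and the first and third questions could be approached with pandas, the snippet below reads a small in-memory CSV instead of the Kaggle file. The column names "Date" and "Total_Accident" and all the numbers are assumptions made up for this example; the real file may use different names.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for the downloaded CSV file.
# Column names and values are hypothetical, not the real data.
sample_csv = io.StringIO(
    "Date,Total_Accident\n"
    "2014-01-01,260\n"
    "2014-01-02,380\n"
    "2014-01-03,420\n"
    "2014-01-04,340\n"
    "2014-01-05,290\n"
    "2014-01-06,400\n"
    "2014-01-07,410\n"
    "2014-01-08,415\n"
    "2014-01-09,405\n"
    "2014-01-10,450\n"
    "2014-01-11,330\n"
    "2014-01-12,280\n"
    "2014-01-13,395\n"
    "2014-01-14,420\n"
)

# Parse the dates and use them as a sorted index so the ordering is explicit.
df = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date").sort_index()

# Question 1: put each day next to the day before and correlate the two.
df["Previous_Day"] = df["Total_Accident"].shift(1)
lag1_corr = df["Total_Accident"].corr(df["Previous_Day"])

# Question 3: average count by day of the week and by month.
by_weekday = df.groupby(df.index.day_name())["Total_Accident"].mean()
by_month = df.groupby(df.index.month)["Total_Accident"].mean()

print(f"lag-1 correlation: {lag1_corr:.3f}")
print(by_weekday)
```

With the real file, the `io.StringIO` object would be replaced by the path to the downloaded CSV, and the column names adjusted to whatever the file actually contains.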

Description of the data

Data Sample

Stats

Graphs and models

Daily Trends of accidents in the UK

Daily Trends of accidents in the UK - dot

Monthly Trends of accidents in the UK

Weekly Trends of accidents in the UK

Boxplot

Histogram

KDE

ACF

PACF

Model
