# Relation between accidents and types


In [1]:
import pandas as pd

# Load the data
file_path = 'Quetta.xlsx'
sheet_name = 'accident types'
df = pd.read_excel(file_path, sheet_name=sheet_name)


In [2]:
import statsmodels.api as sm

# Define the dependent and independent variables
X = df[['Derailments', 'Collisions', 'Collisions at LC']]
y = df['No of Accidents']

# Add a constant to the model (intercept)
X = sm.add_constant(X)

# Fit the GLM model
glm_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# Get the summary of the model
summary = glm_poisson.summary()
print(summary)


                 Generalized Linear Model Regression Results                  
Dep. Variable:        No of Accidents   No. Observations:                    6
Model:                            GLM   Df Residuals:                        2
Model Family:                 Poisson   Df Model:                            3
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -10.853
Date:                Wed, 31 Jul 2024   Deviance:                      0.61883
Time:                        17:05:06   Pearson chi2:                    0.615
No. Iterations:                     4   Pseudo R-squ. (CS):             0.6003
Covariance Type:            nonrobust                                         
                       coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------
const                0.8561      0.513  

## Interpretation

Railway accidents in the Quetta region

We fitted a Poisson generalized linear model (GLM), a quantitative regression method for count data, to railway accident records from the Quetta region to examine which accident types are associated with the total number of accidents. The model included three predictors: derailments, collisions, and collisions at level crossings.

Derailments

The results indicate that derailments are associated with the total number of accidents: each additional derailment is associated with a roughly 15% increase in the expected number of accidents. This suggests that derailments are a significant contributor to accidents in the Quetta region, and that addressing them could reduce the overall accident count.
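Because the model uses a log link, a percentage such as the roughly 15% above comes from exponentiating a coefficient. The following is a minimal sketch of that conversion. The helper names `percent_change` and `coef_for_percent` are illustrative, and the coefficient is back-calculated from the 15% figure rather than read from the summary table, which is truncated here.

```python
import math


def percent_change(coef: float) -> float:
    """Percent change in the expected count per one-unit increase
    in a predictor, for a model with a log link."""
    return (math.exp(coef) - 1.0) * 100.0


def coef_for_percent(pct: float) -> float:
    """Inverse of percent_change: the log-link coefficient that
    corresponds to a given percent change."""
    return math.log(1.0 + pct / 100.0)


# Hypothetical coefficient, back-calculated so that the effect is 15%.
# It is NOT taken from the model output.
example_coef = coef_for_percent(15.0)
print(f"coefficient: {example_coef:.4f}")
print(f"percent change: {percent_change(example_coef):.1f}%")
```

With a fitted statsmodels result, the same conversion would be applied to each entry of the result's `params`.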

Collisions

Each additional collision is associated with a roughly 16% increase in the expected number of accidents. However, this effect is not statistically significant in our data: collisions appear to be linked to a higher number of accidents, but the evidence is weaker than for derailments.

Collisions at Level Crossings

Collisions at level crossings are also associated with more accidents, with each additional level-crossing collision linked to a roughly 13% increase in the expected number of accidents. As with collisions, this effect is not statistically significant, which suggests that its contribution is small compared with that of derailments.

Model Fit

The model's pseudo R-squared (Cox-Snell) is 0.6003, slightly above 60%, which indicates a reasonably good fit. This suggests that the factors we captured (especially derailments) are informative about the accident counts, and possibly the trends, in the Quetta area. With only 6 observations and 2 residual degrees of freedom, this figure should be read with caution.
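The fit figure quoted here is the `Pseudo R-squ. (CS)` entry in the summary table. As a rough sketch of where it comes from, the Cox-Snell pseudo R-squared compares the log-likelihood of the fitted model with that of an intercept-only (null) model. The null log-likelihood is not shown in the output, so the code below back-calculates an implied value from the reported numbers. The function names are illustrative.

```python
import math


def cox_snell_r2(llf: float, llnull: float, nobs: int) -> float:
    """Cox-Snell pseudo R-squared from the fitted and null log-likelihoods."""
    return 1.0 - math.exp((2.0 / nobs) * (llnull - llf))


def implied_llnull(llf: float, r2: float, nobs: int) -> float:
    """Null log-likelihood implied by a reported Cox-Snell R-squared."""
    return llf + (nobs / 2.0) * math.log(1.0 - r2)


# Values reported in the summary table above.
llf = -10.853
nobs = 6
reported_r2 = 0.6003

# Back-calculated, not read from the output: an assumption for illustration.
llnull = implied_llnull(llf, reported_r2, nobs)
print(f"implied null log-likelihood: {llnull:.3f}")
print(f"recomputed pseudo R-squared: {cox_snell_r2(llf, llnull, nobs):.4f}")
```

This quantity measures improvement in likelihood over the null model. It is not the same as the share of variance explained in an ordinary least squares regression.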

Summary

In short, derailments show the strongest association with the number of railway accidents in Quetta. Collisions and level-crossing accidents also contribute, but the evidence for them is less compelling. Efforts to improve railway safety in the area may therefore need to focus on reducing derailments.

# Location

In [3]:
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import glm

# Load the data
file_path = 'Quetta.xlsx'
sheet_name = 'Location of Accidents'
df = pd.read_excel(file_path, sheet_name=sheet_name)

# Replace spaces in column names with underscores
df.columns = df.columns.str.replace(' ', '_')


# Fit the GLM excluding the 'Year' variable
model = glm('No_of_Accidents ~ Accidents_at_Track + Accidents_in_Station_Limits',
            data=df,
            family=sm.families.Poisson()).fit()

# Print the summary of the model
print(model.summary())

                 Generalized Linear Model Regression Results                  
Dep. Variable:        No_of_Accidents   No. Observations:                    6
Model:                            GLM   Df Residuals:                        3
Model Family:                 Poisson   Df Model:                            2
Link Function:                    Log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -10.654
Date:                Wed, 31 Jul 2024   Deviance:                      0.22088
Time:                        17:09:03   Pearson chi2:                    0.220
No. Iterations:                     4   Pseudo R-squ. (CS):             0.6259
Covariance Type:            nonrobust                                         
                                  coef    std err          z      P>|z|      [0.025      0.975]
-----------------------------------------------------------------------------------------------
Intercept         

## Interpretation

Railway accident analysis of the Quetta region

To examine factors related to the number of railway accidents in Quetta, our GLM included two key variables: accidents at track locations and accidents within station limits.

Accidents at Track Locations

The findings show that each additional accident at a track location is associated with an increase of about 17% in the expected total number of accidents. This effect is not statistically significant at the 5% level (p-value = 0.079), although it comes close: the relationship appears to exist, but our data are not strong enough to confirm it. Even so, accidents at track locations may be an important contributor to the overall number of accidents.
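The significance statement for track locations depends on the threshold chosen. The sketch below shows how a two-sided p-value is obtained from a z statistic under the normal approximation used in the `P>|z|` column, and how the reported p-value of 0.079 compares with two common thresholds. The function names are illustrative.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(p: float, alpha: float = 0.05) -> bool:
    """True if the p-value falls below the chosen significance level."""
    return p < alpha


p_track = 0.079  # p-value reported for accidents at track locations

print(is_significant(p_track, alpha=0.05))  # compared with the 5% level
print(is_significant(p_track, alpha=0.10))  # compared with the 10% level
```

The effect misses the 5% level but would pass a 10% level, which is why it is best described as marginal.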

Accidents in Station Limits

For accidents within station limits, the model estimates an increase of about 19% in the expected total number of accidents for each additional accident. However, this variable is not statistically significant (p-value = 0.732), so the analysis provides no strong evidence of a relationship with the total number of accidents. This suggests that accidents within station limits may have less influence on the total number of accidents than other factors.
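An estimated 19% increase can still be non-significant when the estimate is imprecise. The sketch below illustrates this with a confidence interval for the rate ratio. The coefficient and standard error are not available in the truncated output, so both are rough back-calculations from the reported 19% and p = 0.732. They are assumptions for illustration only, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def se_from_p(coef: float, p: float) -> float:
    """Standard error implied by a coefficient and its two-sided p-value,
    assuming a Wald z-test (requires 0 < p < 1)."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return abs(coef) / z


def rate_ratio_ci(coef: float, se: float, level: float = 0.95):
    """Confidence interval for exp(coef), the multiplicative effect
    on the expected count."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.exp(coef - z * se), math.exp(coef + z * se)


# Back-calculated from the text (19% increase, p = 0.732); NOT model output.
coef = math.log(1.19)
se = se_from_p(coef, 0.732)
low, high = rate_ratio_ci(coef, se)
print(f"approximate 95% CI for the rate ratio: ({low:.2f}, {high:.2f})")
```

The interval spans 1, meaning the data are compatible with both a decrease and an increase, so the 19% point estimate alone says little.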

Model Fit

The model's pseudo R-squared (Cox-Snell) is 0.6259, or roughly 62%, which indicates a reasonably good fit. This suggests that the factors in the model, accidents at track locations and within station limits, give a satisfactory account of the variation in accident counts in the Quetta region.
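Alongside the pseudo R-squared, the summary table reports the Pearson chi-square statistic and the residual degrees of freedom. Their ratio is a common rough check on whether the Poisson assumption (variance equal to the mean) is plausible. The following is a minimal sketch using the values printed above. The function name is illustrative.

```python
def dispersion_ratio(pearson_chi2: float, df_resid: int) -> float:
    """Pearson chi-square divided by residual degrees of freedom.
    Values near 1 are consistent with a Poisson model."""
    return pearson_chi2 / df_resid


# Values from the summary table of the location model.
ratio = dispersion_ratio(pearson_chi2=0.220, df_resid=3)
print(f"dispersion ratio: {ratio:.3f}")
```

A ratio well above 1 would point to overdispersion. A ratio far below 1, as here, can arise with very small samples, so it is hard to interpret with 6 observations.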

Summary

To sum up, accidents at track locations show a positive association with the total number of accidents, but this association is not clearly verified at the 5% level. Accidents within station limits do not show a clear relationship with the total number of accidents. Strengthening safety measures along the track may therefore be the more promising way to reduce accident counts in Quetta.
