# 2025 Shalat Data Cleanup

Goal : 
- Practice exploratory data analysis using 2025 shalat data.
- Create a unified 2025 shalat record in a single csv.

## Observation

Observing pattern and fining anomalies in monthly data data

In [1]:
!ls 

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


### January

In [6]:
# Load the data
import pandas as pd

january = pd.read_csv("shalat_2025-jan.csv")

In [7]:
january.head()

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,January,1,Dzuhur,Committed,Jamaah,Mosque
1,January,1,Ashar,Committed,Jamaah,Mosque
2,January,1,Maghrib,Committed,Jamaah,Mosque
3,January,1,Isya,Committed,Jamaah,Mosque
4,January,1,Dzuhur,Committed,Jamaah,Mosque


In [8]:
january.tail()

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
150,January,31,Shubuh,Abandoned,,
151,January,31,Maghrib,Abandoned,,
152,January,31,Shubuh,Abandoned,,
153,January,31,Ashar,Abandoned,,
154,January,31,Maghrib,Abandoned,,


In [24]:
# a function to find totally empty rows
def empty(month):
    empty = month.index[month.isna().all(axis=1)].tolist()
    return empty

empty(january)

[]

In [89]:
january.dtypes

Month         object
Date           int64
Shalat        object
Commitment    object
Person        object
Place         object
dtype: object

In [99]:
def flt_to_int(column):
    """Convert foloat column into int"""
    column["Date"] = column["Date"].astype(int)

There is no empty rows in january data but, the beginning and the end looks off. If I recall correctly, I have never prayed 5 times a day in mosque, only about 2-3 times a day, neither I have abandoned prayers completely. 

The data is useful for summary calculation, but not for detailed-daily calculation.

### February 

In [36]:
february = pd.read_csv("shalat_2025-feb.csv")

In [37]:
february.head()

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,February,1.0,Shubuh,Abandoned,,
1,February,1.0,Dzuhur,Committed,Jamaah,Mosque
2,February,1.0,Ashar,Committed,Jamaah,Mosque
3,February,1.0,Maghrib,Committed,Alone,Home
4,February,1.0,Isya,Committed,Jamaah,Mosque


In [38]:
february.tail()

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
150,February,,,,,
151,February,,,,,
152,February,,,,,
153,February,,,,,
154,February,,,,,


In [39]:
empty(february)

[]

February has no totally empty rows, but I inputted wrong dates. Last 5 dates have to be removed

In [44]:
february.drop(february.tail(5).index, inplace=True)

In [45]:
february.tail()

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
135,February,28.0,Shubuh,Abandoned,,
136,February,28.0,Dzuhur,Committed,Jamaah,Mosque
137,February,28.0,Ashar,Committed,Jamaah,Mosque
138,February,28.0,Maghrib,Committed,Alone,Home
139,February,28.0,Isya,Committed,Alone,Home


In [86]:
february.dtypes

 Month         object
Date          float64
Shalat         object
Commitment     object
Person         object
Place          object
dtype: object

In [94]:
flt_to_int(february)

In [95]:
february.dtypes

 Month        object
Date           int64
Shalat        object
Commitment    object
Person        object
Place         object
dtype: object

### March

In [46]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [110]:
march = pd.read_csv("shalat_2025-mar.csv")
march

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,March,1,Shubuh,Committed,Alone,Home
1,March,1,Dzuhur,Committed,Jamaah,Mosque
2,March,1,Ashar,Committed,Alone,Home
3,March,1,Maghrib,Committed,Alone,Home
4,March,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,March,31,Shubuh,Committed,Alone,Home
151,March,31,Dzuhur,Abandoned,,
152,March,31,Ashar,Committed,Alone,Mosque
153,March,31,Maghrib,Committed,Jamaah,Mosque


March is okay. No changed needed.

### April

In [48]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [50]:
april = pd.read_csv("shalat_2025-apr.csv")
april

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place,Weather
0,April,1.0,Shubuh,Committed,Alone,Home,Rain
1,April,1.0,Dzuhur,Committed,Alone,Home,
2,April,1.0,Ashar,Committed,Alone,Home,
3,April,1.0,Maghrib,Abandoned,,,
4,April,1.0,Isya,Committed,Alone,Home,
...,...,...,...,...,...,...,...
150,April,,,,,,
151,April,,,,,,
152,April,,,,,,
153,April,,,,,,


In [60]:
april.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   Month       150 non-null    object 
 1   Date        150 non-null    float64
 2   Shalat      150 non-null    object 
 3   Commitment  150 non-null    object 
 4   Person      103 non-null    object 
 5   Place       103 non-null    object 
 6   Weather     4 non-null      object 
dtypes: float64(1), object(6)
memory usage: 8.3+ KB


In [51]:
def drop_tail(df_name):
    df_name.drop(df_name.tail(5).index, inplace=True)

In [52]:
drop_tail(april)

In [53]:
april

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place,Weather
0,April,1.0,Shubuh,Committed,Alone,Home,Rain
1,April,1.0,Dzuhur,Committed,Alone,Home,
2,April,1.0,Ashar,Committed,Alone,Home,
3,April,1.0,Maghrib,Abandoned,,,
4,April,1.0,Isya,Committed,Alone,Home,
...,...,...,...,...,...,...,...
145,April,30.0,Shubuh,Abandoned,,,
146,April,30.0,Dzuhur,Committed,Alone,Home,
147,April,30.0,Ashar,Committed,Jamaah,Mosque,
148,April,30.0,Maghrib,Abandoned,,,


In [97]:
flt_to_int(april)

In [98]:
april

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place,Weather
0,April,1,Shubuh,Committed,Alone,Home,Rain
1,April,1,Dzuhur,Committed,Alone,Home,
2,April,1,Ashar,Committed,Alone,Home,
3,April,1,Maghrib,Abandoned,,,
4,April,1,Isya,Committed,Alone,Home,
...,...,...,...,...,...,...,...
145,April,30,Shubuh,Abandoned,,,
146,April,30,Dzuhur,Committed,Alone,Home,
147,April,30,Ashar,Committed,Jamaah,Mosque,
148,April,30,Maghrib,Abandoned,,,


In [149]:
april = april.drop("Weather", axis=1)

In [150]:
april

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,April,1,Shubuh,Committed,Alone,Home
1,April,1,Dzuhur,Committed,Alone,Home
2,April,1,Ashar,Committed,Alone,Home
3,April,1,Maghrib,Abandoned,,
4,April,1,Isya,Committed,Alone,Home
...,...,...,...,...,...,...
145,April,30,Shubuh,Abandoned,,
146,April,30,Dzuhur,Committed,Alone,Home
147,April,30,Ashar,Committed,Jamaah,Mosque
148,April,30,Maghrib,Abandoned,,


### May

In [54]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [57]:
may = pd.read_csv("shalat_2025-mar.csv")

In [58]:
may

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,March,1,Shubuh,Committed,Alone,Home
1,March,1,Dzuhur,Committed,Jamaah,Mosque
2,March,1,Ashar,Committed,Alone,Home
3,March,1,Maghrib,Committed,Alone,Home
4,March,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,March,31,Shubuh,Committed,Alone,Home
151,March,31,Dzuhur,Abandoned,,
152,March,31,Ashar,Committed,Alone,Mosque
153,March,31,Maghrib,Committed,Jamaah,Mosque


### June

In [61]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [63]:
june = pd.read_csv("shalat_2025-jun.csv")
june

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,June,1.0,Shubuh,Committed,Alone,Home
1,June,1.0,Dzuhur,Committed,Alone,Home
2,June,1.0,Ashar,Committed,Alone,Mosque
3,June,1.0,Maghrib,Committed,Alone,Home
4,June,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,June,,,,,
151,June,,,,,
152,June,,,,,
153,June,,,,,


In [64]:
drop_tail(june)
june

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,June,1.0,Shubuh,Committed,Alone,Home
1,June,1.0,Dzuhur,Committed,Alone,Home
2,June,1.0,Ashar,Committed,Alone,Mosque
3,June,1.0,Maghrib,Committed,Alone,Home
4,June,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
145,June,30.0,Shubuh,Abandoned,,
146,June,30.0,Dzuhur,Abandoned,Jamaah,Mosque
147,June,30.0,Ashar,Abandoned,Alone,Home
148,June,30.0,Maghrib,Abandoned,Alone,Home


In [100]:
flt_to_int(june)

In [101]:
june

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,June,1,Shubuh,Committed,Alone,Home
1,June,1,Dzuhur,Committed,Alone,Home
2,June,1,Ashar,Committed,Alone,Mosque
3,June,1,Maghrib,Committed,Alone,Home
4,June,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
145,June,30,Shubuh,Abandoned,,
146,June,30,Dzuhur,Abandoned,Jamaah,Mosque
147,June,30,Ashar,Abandoned,Alone,Home
148,June,30,Maghrib,Abandoned,Alone,Home


### July

In [65]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [66]:
july = pd.read_csv("shalat_2025-jul.csv")
july

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,July,1,Shubuh,Committed,Alone,Home
1,July,1,Dzuhur,Committed,Jamaah,Mosque
2,July,1,Ashar,Committed,Jamaah,Mosque
3,July,1,Maghrib,Committed,Alone,Home
4,July,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,July,31,Shubuh,Committed,Alone,Home
151,July,31,Dzuhur,Committed,Alone,Home
152,July,31,Ashar,Committed,Jamaah,Mosque
153,July,31,Maghrib,Committed,Alone,Home


### August

In [70]:
august = pd.read_csv("shalat_2025-aug.csv")
august

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,August,1,Shubuh,Committed,Alone,Home
1,August,1,Dzuhur,Committed,Jamaah,Mosque
2,August,1,Ashar,Committed,Alone,Home
3,August,1,Maghrib,Committed,Alone,Home
4,August,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,August,31,Shubuh,Committed,Alone,Home
151,August,31,Dzuhur,Committed,Alone,Home
152,August,31,Ashar,Committed,Jamaah,Mosque
153,August,31,Maghrib,Committed,Alone,Home


### September

In [72]:
september = pd.read_csv("shalat_2025-sep.csv")
september

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,September,1.0,Shubuh,Committed,Alone,Home
1,September,1.0,Dzuhur,Committed,Alone,Home
2,September,1.0,Ashar,Committed,Jamaah,Mosque
3,September,1.0,Maghrib,Committed,Alone,Home
4,September,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,September,,,,,
151,September,,,,,
152,September,,,,,
153,September,,,,,


In [73]:
drop_tail(september)
september

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,September,1.0,Shubuh,Committed,Alone,Home
1,September,1.0,Dzuhur,Committed,Alone,Home
2,September,1.0,Ashar,Committed,Jamaah,Mosque
3,September,1.0,Maghrib,Committed,Alone,Home
4,September,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
145,September,30.0,Shubuh,Committed,Alone,Home
146,September,30.0,Dzuhur,Committed,Alone,Mosque
147,September,30.0,Ashar,Abandoned,,
148,September,30.0,Maghrib,Committed,Alone,Home


In [102]:
flt_to_int(september)
september.dtypes

Month         object
Date           int64
Shalat        object
Commitment    object
Person        object
Place         object
dtype: object

### October 

In [81]:
october = pd.read_csv("shalat_2025-oct.csv")
october

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,October,1,Shubuh,Committed,Alone,Home
1,October,1,Dzuhur,Committed,Alone,Mosque
2,October,1,Ashar,Committed,Alone,Mosque
3,October,1,Maghrib,Abandoned,,
4,October,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,October,31,Shubuh,Committed,Alone,Home
151,October,31,Dzuhur,Committed,Alone,Mosque
152,October,31,Ashar,Committed,Alone,Home
153,October,31,Maghrib,Abandoned,,


### November

In [75]:
november = pd.read_csv("shalat_2025-nov.csv")
november

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,November,1.0,Shubuh,Committed,Alone,Home
1,November,1.0,Dzuhur,Abandoned,,
2,November,1.0,Ashar,Abandoned,,
3,November,1.0,Maghrib,Committed,Alone,Home
4,November,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,November,,,,,
151,November,,,,,
152,November,,,,,
153,November,,,,,


In [79]:
drop_tail(november)

In [80]:
november

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,November,1.0,Shubuh,Committed,Alone,Home
1,November,1.0,Dzuhur,Abandoned,,
2,November,1.0,Ashar,Abandoned,,
3,November,1.0,Maghrib,Committed,Alone,Home
4,November,1.0,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
145,November,30.0,Shubuh,Abandoned,,
146,November,30.0,Dzuhur,Committed,Alone,Home
147,November,30.0,Ashar,Committed,Jamaah,Mosque
148,November,30.0,Maghrib,Committed,Alone,Home


In [104]:
flt_to_int(november)
november.dtypes

Month         object
Date           int64
Shalat        object
Commitment    object
Person        object
Place         object
dtype: object

### December

In [83]:
!ls

shalat_2025-apr.csv  shalat_2025-jan.csv  shalat_2025-may.csv  Untitled.ipynb
shalat_2025-aug.csv  shalat_2025-jul.csv  shalat_2025-nov.csv
shalat_2025-dec.csv  shalat_2025-jun.csv  shalat_2025-oct.csv
shalat_2025-feb.csv  shalat_2025-mar.csv  shalat_2025-sep.csv


In [85]:
december = pd.read_csv("shalat_2025-dec.csv")
december

Unnamed: 0,Month,Date,Shalat,Commitment,Person,Place
0,December,1,Shubuh,Committed,Alone,Home
1,December,1,Dzuhur,Committed,Jamaah,Mosque
2,December,1,Ashar,Committed,Alone,Home
3,December,1,Maghrib,Committed,Alone,Home
4,December,1,Isya,Committed,Jamaah,Mosque
...,...,...,...,...,...,...
150,December,31,Shubuh,Abandoned,,
151,December,31,Dzuhur,Abandoned,,
152,December,31,Ashar,Committed,Jamaah,Mosque
153,December,31,Maghrib,Committed,Alone,Home


## Reconstruction

In [151]:
# Save all month data into new csv
# saving the dataframe
january.to_csv('january.csv', header=True, index=False)
february.to_csv('february.csv', header=True, index=False)
march.to_csv('march.csv', header=True, index=False)
april.to_csv('april.csv', header=True, index=False)
may.to_csv('may.csv', header=True, index=False)
june.to_csv('june.csv', header=True, index=False)
july.to_csv('july.csv', header=True, index=False)
august.to_csv('august.csv', header=True, index=False)
september.to_csv('september.csv', header=True, index=False)
october.to_csv('october.csv', header=True, index=False)
november.to_csv('november.csv', header=True, index=False)
december.to_csv('december.csv', header=True, index=False)

In [152]:
!ls

april.csv     february.csv  june.csv   november.csv  september.csv
august.csv    january.csv   march.csv  october.csv   shalat_2025.ipynb
december.csv  july.csv	    may.csv    old_data


In [155]:
shalat_2025 = pd.concat([january, february, march, april, 
                        may, june, july, august, september,
                        october, november, december])

In [157]:
shalat_2025.to_csv("shalat_2025.csv", header=True, index=False)

In [158]:
!ls

april.csv     february.csv  june.csv   november.csv  september.csv
august.csv    january.csv   march.csv  october.csv   shalat_2025.csv
december.csv  july.csv	    may.csv    old_data      shalat_2025.ipynb
