# Data Analysis Project: Bike Sharing Dataset
- **Name:** Meisy Nathania Yogianty
- **Email:** meisynathania.y@gmail.com
- **Dicoding ID:** meisynathania

## Defining the Business Questions

- In which season are the most bikes rented?
- During which hours should a bike rental business ideally be open?

## Import All Packages/Libraries Used

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import datetime
sns.set(style='dark')

## Data Wrangling

### Gathering Data

In [None]:
daily = pd.read_csv("day.csv")
daily.head()

In [None]:
hourly = pd.read_csv("hour.csv")
hourly.head()

### Assessing Data

In [None]:
daily.info()

In [None]:
hourly.info()

In [None]:
print(hourly.isna().sum())
print(daily.isna().sum())
print(daily.duplicated().sum())
print(hourly.duplicated().sum())
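
The missing-value and duplicate checks above can be wrapped in a small helper so the same summary is produced for both tables. This is a minimal sketch: `quality_report` is a hypothetical name, and the toy frame uses made-up values rather than rows from `day.csv` or `hour.csv`.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Return the total number of missing cells and of fully duplicated rows."""
    return {
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Toy frame (made-up values) with one missing cell and one duplicated row.
toy = pd.DataFrame({
    "hr": [0, 1, 1, 2],
    "cnt": [16.0, 40.0, 40.0, None],
})

report = quality_report(toy)
print(report)
```

In the notebook itself, the helper could be called as `quality_report(daily)` and `quality_report(hourly)`.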

In [None]:
daily.describe()

In [None]:
hourly.describe()

### Cleaning Data

In [None]:
daily['dteday'] = pd.to_datetime(daily['dteday'])
# Map numeric codes to readable labels; assign back instead of using inplace=True on a column
daily['season'] = daily['season'].replace([1, 2, 3, 4], ['spring', 'summer', 'fall', 'winter'])
daily['yr'] = daily['yr'].replace([0, 1], ['2011', '2012'])
daily['mnth'] = daily['mnth'].astype(str)
daily['holiday'] = daily['holiday'].replace([0, 1], ['not holiday', 'holiday'])
daily['weekday'] = daily['weekday'].astype(str)
daily['workingday'] = daily['workingday'].replace([0, 1], ['weekend/holiday', 'workingday'])
daily.info()

In [None]:
hourly['dteday'] = pd.to_datetime(hourly['dteday'])
# Same label mapping as for the daily table
hourly['season'] = hourly['season'].replace([1, 2, 3, 4], ['spring', 'summer', 'fall', 'winter'])
hourly['yr'] = hourly['yr'].replace([0, 1], ['2011', '2012'])
hourly['mnth'] = hourly['mnth'].astype(str)
hourly['holiday'] = hourly['holiday'].replace([0, 1], ['not holiday', 'holiday'])
hourly['weekday'] = hourly['weekday'].astype(str)
hourly['workingday'] = hourly['workingday'].replace([0, 1], ['weekend/holiday', 'workingday'])
hourly.info()
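
An alternative way to express the code-to-label step is a dictionary passed to `Series.map`, which keeps each code next to its label. The sketch below is illustrative only: `SEASON_LABELS` is a hypothetical name, and the toy frame holds made-up rows, not data from the bike sharing files.

```python
import pandas as pd

# Season codes 1 to 4 and their readable labels.
SEASON_LABELS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}

# Toy frame (made-up rows); the notebook loads these columns from the CSV files.
toy = pd.DataFrame({"season": [1, 2, 3, 4, 1], "cnt": [10, 20, 30, 25, 12]})

# Series.map returns a new Series, so the result is assigned back to the column.
toy["season"] = toy["season"].map(SEASON_LABELS)

# Any code missing from the dictionary becomes NaN, which makes gaps easy to spot.
unmapped = int(toy["season"].isna().sum())
print(toy["season"].tolist(), unmapped)
```

Unlike `replace`, `map` turns any value that is absent from the dictionary into `NaN`, so counting `NaN` values afterwards is a quick check that every code was covered.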

In [None]:
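hourly.to_csv("hourly_clean.csv", index=False)
daily.to_csv("daily_clean.csv", index=False)
# Keep the cleaned frames in memory under the names used in the rest of the notebook
hourly_clean, daily_clean = hourly.copy(), daily.copy()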

## Exploratory Data Analysis (EDA)

### Explore Descriptive Statistics

In [None]:
daily_clean.describe(include = "all")

In [None]:
hourly_clean.describe(include = "all")

### Comparison of Bikes Rented on Holidays and Non-Holidays


In [None]:
daily_clean.groupby(by="holiday").agg({
    "instant": "nunique",
    "cnt": ["max", "min", "mean", "std"]
})

In [None]:
hourly_clean.groupby(by="hr").agg({
    "cnt": ["max", "min", "mean", "std"],
})

### Comparison of Bike Rentals in Each Season

In [None]:
seasonal = pd.DataFrame(daily_clean.groupby(by=["season","yr"]).agg({
    "cnt": ["sum"],
}).unstack())
seasonal

In [None]:
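# Total rentals per season (rows) and year (columns), computed directly from the daily data
grouped_data = daily_clean.groupby(["season", "yr"])["cnt"].sum().unstack()
grouped_data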
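
The season-by-year totals can also be built with `pivot_table` and then summed across years to rank the seasons. The sketch below assumes a table with `season`, `yr`, and `cnt` columns like the cleaned daily data; the counts are made up for illustration and are not results from the dataset.

```python
import pandas as pd

# Toy daily records (made-up counts), shaped like the cleaned daily table.
toy = pd.DataFrame({
    "season": ["spring", "spring", "fall", "fall", "summer", "summer"],
    "yr": ["2011", "2012", "2011", "2012", "2011", "2012"],
    "cnt": [100, 150, 300, 400, 250, 350],
})

# One row per season, one column per year, each cell holding total rentals.
season_by_year = toy.pivot_table(
    index="season", columns="yr", values="cnt", aggfunc="sum"
)

# Sum across years and sort in descending order to rank the seasons.
ranking = season_by_year.sum(axis=1).sort_values(ascending=False)
print(season_by_year)
print(ranking)
```

Sorting the totals makes the ordering of the seasons explicit instead of leaving it to be read off a chart.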

## Visualization & Explanatory Analysis

### Question 1: In which season are the most bikes rented?

In [None]:
grouped_data.plot(kind='bar', colormap='tab20')

### Question 2: During which hours should a bike rental business ideally be open?

In [None]:
hourly_day = hourly_clean[hourly_clean["dteday"] == '2011-01-01']
# The date can be changed to see how rentals vary from day to day
plt.figure(figsize=(10, 5))
plt.plot(hourly_day["hr"], hourly_day["cnt"], marker='o', linewidth=2, color="#72BCD4")
plt.title("Bikes Rented per Hour", loc="center", fontsize=20)
plt.xlabel("Hour of day")
plt.ylabel("Bikes rented")
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
plt.show()
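
The plot above shows a single date. To look at the typical daily pattern instead, the hourly counts can be averaged over all dates. This is a minimal sketch under the assumption that the table has `hr` and `cnt` columns as in the hourly data; the toy counts are made up and the names `hourly_mean` and `peak_hour` are hypothetical.

```python
import pandas as pd

# Toy hourly records (made-up counts) for two days and three hours.
toy = pd.DataFrame({
    "dteday": ["2011-01-01"] * 3 + ["2011-01-02"] * 3,
    "hr": [7, 8, 9, 7, 8, 9],
    "cnt": [10, 50, 30, 20, 70, 40],
})

# Average rentals for each hour of the day across all dates.
hourly_mean = toy.groupby("hr")["cnt"].mean()

# Hour with the highest average demand.
peak_hour = int(hourly_mean.idxmax())
print(hourly_mean)
print(peak_hour)
```

Applied to the full hourly table, the same `groupby("hr")` result could be passed to `plt.plot` in place of the single-day series.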

## Conclusion

1. The most bikes were rented in fall. Summer ranks second, winter third, and spring last. Revenue optimization can therefore focus on fall, summer, and winter.
2. Based on the rental trend, the business is best suited to operating hours of 05:00 to 20:00.