### About the df
Each row is a bike. Only currently parked bikes are shown in each data table.

### **Column meanings:**
 
* city
* time_stamp 
* uid - another station ID, identifies a station; a free-floating bike is assigned its own unique uid
* lat 
* lng 
* bike - True means the bike is free-floating; False means the bike is parked at a station
* name - station name, identifies station
* station_number - station ID, identifies station
* booked_bikes - probably 1 if a bike is currently booked, otherwise 0 (a guess)
* bikes
* bikes_available_to_rent
* bike_racks
* free_racks
* maintenance - boolean, probably means that a station needs maintenance
* terminal_type - missing for free-floating bikes and for stations where bikes are blocked; "free" for actual stations where bikes are not blocked by racks, e.g. the "Hauptbahnhof Nord" station in Dresden
* place_type - 12 for free-floating bikes; the other values are characteristics of a station
* rack_locks
* no_registration
* bike_number - unique ID of a bike, identifies a bike
* bike_type
* lock_types
* active - bike is active and available for rent
* state
* electric_lock
* boardcomputer - unique ID of bike's boardcomputer
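
Several of the column meanings above are guesses, so it may help to turn them into explicit checks. The sketch below runs such checks on a small made-up frame (`toy` and all of its values are hypothetical; only the column names come from the list). The same expressions can be applied to a real snapshot.

```python
import pandas as pd

# Toy snapshot with made-up values; only the column names come from the list above.
toy = pd.DataFrame({
    "uid": [101, 101, 202, 303],
    "bike": [False, False, True, True],
    "place_type": [0, 0, 12, 12],
    "terminal_type": ["free", "free", None, None],
    "bike_number": [9001, 9002, 9003, 9004],
})

free_floating = toy[toy["bike"]]

# Hypothesis 1: every free-floating bike has place_type 12.
all_place_type_12 = bool((free_floating["place_type"] == 12).all())

# Hypothesis 2: terminal_type is missing for free-floating bikes.
terminal_missing = bool(free_floating["terminal_type"].isna().all())

# Hypothesis 3: each free-floating bike has its own uid.
uid_unique = bool(free_floating["uid"].is_unique)

# Hypothesis 4: bike_number identifies a bike, so it has no duplicates in one snapshot.
bike_number_unique = bool(toy["bike_number"].is_unique)

print(all_place_type_12, terminal_missing, uid_unique, bike_number_unique)
```

A `False` on real data would mean the corresponding description in the list needs revisiting.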

# imports

In [39]:
import pyreadr
from datetime import datetime
from tqdm.notebook import tqdm
import seaborn as sns
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 20)

# nighttime df

In [None]:
df = pyreadr.read_r('../data/Dresden_01-03.24/2024-01-01/2024-01-01-23-59-00.rds')[None]

In [None]:
df.columns

In [None]:
len(df)

In [None]:
df[df.booked_bikes==1]

In [None]:
df.sort_values("uid").head(10)

In [None]:
df[df.booked_bikes!=0]

In [None]:
len(df.name.unique())

In [None]:
df.terminal_type.unique()

In [None]:
len(df.uid.unique())

In [None]:
df[df.station_number==43001]

In [None]:
df[df.uid==32938439]

In [None]:
df[df.terminal_type.isna()]

In [None]:
df.groupby(["bike", "terminal_type"]).size()

In [None]:
len(df.station_number.unique())

In [None]:
df.tail()

In [None]:
df.describe()

In [None]:
df.info()

In [None]:
df[df.bike  == False]

In [None]:
df[df.bike  == False].groupby("uid").size().sort_values()

In [None]:
df[df.bike  == True].groupby("uid").size().sort_values()

In [None]:
df[df.bike  == True]

In [None]:
df[df.station_number==43005]

In [None]:
df[df.station_number==43010]

In [None]:
df[df.maintenance==True]

# columns loop

In [None]:
for col in df.columns:
    print(col.upper())
    print( f"{len(df[col].unique())} unique values" )
    print(df.groupby(col).size().sort_values(ascending=False).head(5))
    print()
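
One thing to keep in mind when reading the output of the loop above: `unique()` counts missing values as a value, while `groupby(col).size()` drops them by default. For columns such as `terminal_type`, which is missing for many rows, the top-5 list can therefore hide the most frequent case. A minimal sketch on a made-up column (the values are hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up column with more missing values than real ones.
toy = pd.DataFrame({"terminal_type": ["free", np.nan, np.nan, np.nan, "free"]})

# groupby drops the NaN group by default, so only "free" is reported.
grouped = toy.groupby("terminal_type").size()

# value_counts(dropna=False) keeps NaN as its own category.
counted = toy["terminal_type"].value_counts(dropna=False)

# nunique(dropna=False) agrees with len(unique()): NaN is counted once.
n_unique = toy["terminal_type"].nunique(dropna=False)

print(grouped)
print(counted)
print(n_unique)
```

Replacing `df.groupby(col).size()` with `df[col].value_counts(dropna=False)` in the loop would make the two printed numbers consistent.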

# daytime bike dataset

In [27]:
result = pyreadr.read_r('../data/Dresden_01-03.24/2024-01-10/2024-01-10-13-15-00.rds')
# 10.01.2024 is Wednesday
df_daytime = result[None]

In [None]:
len(df_daytime)

In [None]:
df_daytime

In [None]:
df_daytime[df_daytime.active!=True]

In [None]:
df_daytime[df_daytime.booked_bikes!=0]

In [None]:
df_daytime[df_daytime.station_number==43001]

In [None]:
df_daytime[df_daytime.maintenance==True]

In [57]:
# in the daytime, there are slightly more booked bikes


In [None]:
df_daytime[df_daytime.bikes>10]

In [None]:
df_daytime[df_daytime.station_number==43005]

In [None]:
df_daytime.place_type.unique()

In [None]:
df_daytime[df_daytime.place_type==21]

In [None]:
df_daytime[df_daytime.place_type==17]

In [None]:
df_daytime[df_daytime.place_type==18]

In [None]:
df_daytime[(df_daytime.place_type==12) & (df_daytime.bike==True)] 

In [None]:
name_place_type_gr = df_daytime.groupby(["name", "place_type"]).size()
name_place_type_gr = name_place_type_gr[name_place_type_gr!=0]

In [None]:
df_daytime.no_registration.unique()

# rushhour bike dataset

In [31]:
result = pyreadr.read_r('../data/Dresden_01-03.24/2024-01-31/2024-01-31-08-15-00.rds') # Wednesday
df_rushhour = result[None]

In [None]:
len(df_rushhour)

In [None]:
df_rushhour

In [None]:
len(df_rushhour.station_number.unique())

In [None]:
df_rushhour[df_rushhour.active!=True]

In [None]:
ax = sns.heatmap(df_rushhour.isnull(), cbar=True, cmap="Greys")

In [None]:
df_rushhour[df_rushhour.station_number==43052]

In [None]:
df_rushhour[df_rushhour.maintenance==True]

In [None]:
df_rushhour[df_rushhour.station_number==43010]

# 2024-02-14

In [36]:
def get_filename(time):
    return f'../data/Dresden_01-03.24/{time.strftime("%Y-%m-%d")}/{time.strftime("%Y-%m-%d-%H-%M-00")}.rds'
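
As a quick sanity check, the helper can be called with a single timestamp to confirm that it reproduces the naming pattern of the files loaded earlier. The function is repeated here only to keep the sketch self-contained; the timestamp is an arbitrary example.

```python
from datetime import datetime

def get_filename(time):
    # Same format as the cell above: one folder per day, one file per minute.
    return f'../data/Dresden_01-03.24/{time.strftime("%Y-%m-%d")}/{time.strftime("%Y-%m-%d-%H-%M-00")}.rds'

# The seconds are hard-coded to "00" in the file name, so the 42 is ignored.
example_path = get_filename(datetime(2024, 2, 14, 8, 15, 42))
print(example_path)
```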


In [None]:
start_time = datetime(2024, 2, 14, 0, 0, 0) # the second filename
end_time = datetime(2024, 2, 14, 23, 59, 0)

for current_time in tqdm(pd.date_range(start=start_time, end=end_time, freq="min")):
    current_filename = get_filename(current_time)
    try:
        df_current = pyreadr.read_r(current_filename)[None]
        non_active_bikes = len(df_current[df_current.active!=True])
        if non_active_bikes>0:
            print(f"at time {current_time}: {non_active_bikes} inactive bikes")
    except Exception:
        # skip minutes whose snapshot file is missing or unreadable
        pass

conclusion: there are no inactive bikes on 14 Feb
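
Because the loop silently skips files it cannot read, an empty output could also mean that no snapshot was loaded at all. A possible variant, sketched below, keeps the per-minute counts and records the skipped timestamps separately. `scan_inactive` and `load_snapshot` are hypothetical names; the fake loader and its frames are made up so the sketch runs without the data files.

```python
import pandas as pd

def scan_inactive(times, load_snapshot):
    """Return (inactive-bike count per readable snapshot, list of skipped times)."""
    counts = {}
    skipped = []
    for t in times:
        try:
            snapshot = load_snapshot(t)
        except Exception:
            # Remember which minutes could not be read instead of ignoring them.
            skipped.append(t)
            continue
        # Same condition as in the loop above: everything that is not True.
        counts[t] = int((snapshot["active"] != True).sum())
    return pd.Series(counts, dtype="int64"), skipped

# Demo with made-up snapshots; the middle minute is deliberately missing.
times = pd.date_range("2024-02-14 00:00", periods=3, freq="min")
frames = {
    times[0]: pd.DataFrame({"active": [True, True]}),
    times[2]: pd.DataFrame({"active": [True, False]}),
}
counts, skipped = scan_inactive(times, lambda t: frames[t])
print(counts)
print(skipped)
```

With the real data, `load_snapshot` could be something like `lambda t: pyreadr.read_r(get_filename(t))[None]`. The conclusion holds if `counts` is non-empty, sums to zero, and `skipped` is short.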