## KPI (key performance indicators)

KPIs are metrics you track to measure your performance and to check whether you're achieving your objectives.

- They support your main goals.

- They answer two main questions: 
     - What are you trying to achieve?
     - What would success look like?

## S.M.A.R.T

- Specific
- Measurable
- Attainable
- Relevant
- Time-bound

Example KPI: Reduce classic bike trips without an end station by 10% in the next 3 months.

- Specific: Classic bike trips with no end station
- Measurable: 10%
- Attainable: Needs investigation
- Relevant: Avoid losing bikes
- Time-bound: 3 months



In [6]:
import pandas as pd
from zipfile import ZipFile
from io import BytesIO
import requests

url08 = 'https://s3.amazonaws.com/capitalbikeshare-data/202308-capitalbikeshare-tripdata.zip'
url09 = 'https://s3.amazonaws.com/capitalbikeshare-data/202309-capitalbikeshare-tripdata.zip'
url10 = 'https://s3.amazonaws.com/capitalbikeshare-data/202310-capitalbikeshare-tripdata.zip'

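def import_csv_from_zip(url):
    """Download a zip archive and return its first CSV file as a DataFrame."""
    r = requests.get(url)
    # Stop early on HTTP errors instead of failing later while unzipping
    r.raise_for_status()
    buf = BytesIO(r.content)
    with ZipFile(buf, "r") as f:
        for name in f.namelist():
            if name.endswith('.csv'):
                with f.open(name) as zd:
                    return pd.read_csv(zd)
    # Fail loudly instead of silently returning None
    raise ValueError(f"no CSV file found in {url}")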

In [7]:
trip08 = import_csv_from_zip(url08)
trip09 = import_csv_from_zip(url09)
trip10 = import_csv_from_zip(url10)


In [8]:


# Work on copies so the downloaded frames stay untouched
trip_import08 = trip08.copy()
trip_import09 = trip09.copy()
trip_import10 = trip10.copy()
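
The notebook keeps the three months in separate frames. If a single frame is more convenient, one option is `pd.concat` with `keys`, which labels each row with its month. This is only a sketch: the small frames below are hypothetical stand-ins for `trip_import08`, `trip_import09`, and `trip_import10`, and the `month` column name is an assumption.

```python
import pandas as pd

# Hypothetical stand-ins for the three monthly frames (two rows each)
aug = pd.DataFrame({"ride_id": ["a1", "a2"],
                    "rideable_type": ["classic_bike", "electric_bike"]})
sep = pd.DataFrame({"ride_id": ["s1", "s2"],
                    "rideable_type": ["classic_bike", "docked_bike"]})
oct_ = pd.DataFrame({"ride_id": ["o1", "o2"],
                     "rideable_type": ["electric_bike", "classic_bike"]})

# keys= adds an outer index level per input frame; names= labels the levels
trips = (
    pd.concat([aug, sep, oct_],
              keys=["2023-08", "2023-09", "2023-10"],
              names=["month", "row"])
    .reset_index(level="month")   # turn the month level into a regular column
    .reset_index(drop=True)       # drop the leftover per-file row index
)

print(trips["month"].value_counts().sort_index())
```

With a `month` column in place, later per-month steps can use `groupby("month")` instead of repeating a call for each frame.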



In [9]:
trip_import10.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 491084 entries, 0 to 491083
Data columns (total 13 columns):
 #   Column              Non-Null Count   Dtype  
---  ------              --------------   -----  
 0   ride_id             491084 non-null  object 
 1   rideable_type       491084 non-null  object 
 2   started_at          491084 non-null  object 
 3   ended_at            491084 non-null  object 
 4   start_station_name  434475 non-null  object 
 5   start_station_id    434475 non-null  float64
 6   end_station_name    431026 non-null  object 
 7   end_station_id      431026 non-null  float64
 8   start_lat           491084 non-null  float64
 9   start_lng           491084 non-null  float64
 10  end_lat             490538 non-null  float64
 11  end_lng             490538 non-null  float64
 12  member_casual       491084 non-null  object 
dtypes: float64(6), object(7)
memory usage: 48.7+ MB
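
The `info()` output lists non-null counts, so the number of missing values per column is the row count minus that figure. The sketch below does that subtraction with the numbers copied from the output above, then shows the equivalent `isna().sum()` call on a small hypothetical frame (`toy` is not part of the notebook).

```python
import pandas as pd

# Counts copied from the info() output for the October file
total_rows = 491084
non_null = {
    "start_station_name": 434475,
    "end_station_name": 431026,
    "end_lat": 490538,
}

# Missing values per column = total rows - non-null rows
missing = {col: total_rows - n for col, n in non_null.items()}
print(missing)

# On a real frame, isna().sum() gives the same kind of count directly.
# 'toy' is a hypothetical frame used only for illustration.
toy = pd.DataFrame({
    "end_station_name": ["A", None, None],
    "end_lat": [38.9, 38.8, None],
})
print(toy.isna().sum())
```

For October this gives 60,058 trips without an end station name, about 12% of all trips in the file.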


## Filter lost classic bikes

In [10]:
def no_return_classic_df(df):
    # Boolean mask: trips with no end station recorded
    no_end_trip = df['end_station_name'].isnull()
    # Trips with no end station
    trips_no_end_filtered = df[no_end_trip]
    # Count those trips by bike type
    by_bike_trips_no_end_filtered = trips_no_end_filtered['rideable_type'].value_counts()
    print(by_bike_trips_no_end_filtered)
    # Keep only the classic bikes
    trips_no_end_classic = trips_no_end_filtered[trips_no_end_filtered['rideable_type'] == 'classic_bike']
    return trips_no_end_classic

no_return_classic_08 = no_return_classic_df(trip_import08)
no_return_classic_09 = no_return_classic_df(trip_import09)
no_return_classic_10 = no_return_classic_df(trip_import10)





rideable_type
electric_bike    44459
classic_bike       438
docked_bike        186
Name: count, dtype: int64
rideable_type
electric_bike    46651
classic_bike       650
docked_bike         95
Name: count, dtype: int64
rideable_type
electric_bike    59395
classic_bike       663
Name: count, dtype: int64
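
The classic-bike counts above (438, 650, and 663) can be turned into a concrete target for the 10% KPI. The sketch below assumes the three months together form the baseline period; the notebook does not state which baseline to use, so treat that choice and the helper `reduction_achieved` as illustrative.

```python
import math

# Classic-bike trips without an end station, copied from the output above
baseline = {"2023-08": 438, "2023-09": 650, "2023-10": 663}
reduction = 0.10  # the 10% from the KPI

# Assumption: August to October together form the baseline period
baseline_total = sum(baseline.values())

# Largest whole number of trips that still meets a reduction of at least 10%
target_total = math.floor(baseline_total * (1 - reduction))


def reduction_achieved(baseline_count, current_count):
    """Fractional reduction relative to the baseline (positive = fewer trips)."""
    return (baseline_count - current_count) / baseline_count


print(baseline_total, target_total)
```

Under this assumption the baseline is 1,751 trips, so the next three months would need to stay at or below 1,575 such trips to meet the KPI.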


In [11]:
no_return_classic_10.info()
no_return_classic_09.info()
no_return_classic_08.info()

<class 'pandas.core.frame.DataFrame'>
Index: 663 entries, 24954 to 491070
Data columns (total 13 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   ride_id             663 non-null    object 
 1   rideable_type       663 non-null    object 
 2   started_at          663 non-null    object 
 3   ended_at            663 non-null    object 
 4   start_station_name  662 non-null    object 
 5   start_station_id    662 non-null    float64
 6   end_station_name    0 non-null      object 
 7   end_station_id      0 non-null      float64
 8   start_lat           663 non-null    float64
 9   start_lng           663 non-null    float64
 10  end_lat             117 non-null    float64
 11  end_lng             117 non-null    float64
 12  member_casual       663 non-null    object 
dtypes: float64(6), object(7)
memory usage: 72.5+ KB
<class 'pandas.core.frame.DataFrame'>
Index: 650 entries, 1799 to 384872
Data columns (total 13 columns):

## Find patterns in non-returned bikes
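
Because the monthly totals differ, shares can be easier to compare across months than raw counts. A minimal sketch using `value_counts(normalize=True)`; the frame `toy` is hypothetical and only stands in for one of the filtered frames.

```python
import pandas as pd

# Hypothetical toy frame standing in for one of the no_return_classic_* frames
toy = pd.DataFrame({"member_casual": ["casual", "casual", "casual", "member"]})

# normalize=True returns fractions of the total instead of raw counts
shares = toy["member_casual"].value_counts(normalize=True)
print(shares)
```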

In [12]:


def patter_no_return(df):
    # Count non-returned trips by member status
    member_type_no_return = df['member_casual'].value_counts()
    print('member', member_type_no_return)

    # Count non-returned trips by start station
    start_no_return = df['start_station_name'].value_counts()
    print('start_station', start_no_return)

patter_no_return(no_return_classic_08)
patter_no_return(no_return_classic_09)
patter_no_return(no_return_classic_10)



member member_casual
casual    303
member    135
Name: count, dtype: int64
start_station start_station_name
Smithsonian-National Mall / Jefferson Dr & 12th St SW    8
17th & K St NW                                           6
Jefferson Dr & 14th St SW                                5
8th & H St NE                                            4
16th & Irving St NW                                      4
                                                        ..
Fenton St & New York Ave                                 1
Oklahoma Ave & D St NE                                   1
Georgia & New Hampshire Ave NW                           1
Pleasant St & MLK Ave SE                                 1
Montgomery Ave & Waverly St                              1
Name: count, Length: 281, dtype: int64
member member_casual
casual    395
member    255
Name: count, dtype: int64
start_station start_station_name
Columbus Circle / Union Station                   10
West Hyattsville Metro                     