# Part 2: Indego Data Analysis with SQL

Jump to a specific section:
<br> *(view through [nbviewer](https://nbviewer.org/github/MichaelStinson/Indego-City-Bikes-with-Google-Maps-API-using-Python/blob/main/Part%202%20Indego%20Data%20Analysis%20with%20SQL.ipynb) to use the jump links below)*<br>
- [Data overview](#Data-overview)
- [Analyze station data](#Analyze-station-data)
- [Join trip data with distance data](#Join-trip-data-with-distance-data)

## Data overview
[Return to top](#Part-2:-Indego-Data-Analysis-with-SQL)

<br>Load the SQL extension and connect to the SQLite database

In [5]:
%%capture
%load_ext sql
%sql sqlite:///bike_trip_db

<br>Print the list of tables

In [6]:
%%sql
SELECT *
  FROM sqlite_master
 WHERE type='table';

 * sqlite:///bike_trip_db
Done.


type,name,tbl_name,rootpage,sql
table,stations,stations,2,"CREATE TABLE ""stations"" ( ""station_id"" INTEGER,  ""station_name"" TEXT,  ""day_of_go_live_date"" TIMESTAMP,  ""status"" TEXT,  ""neighborhood"" TEXT,  ""station_lat"" REAL,  ""station_lon"" REAL )"
table,station_combos,station_combos,45771,"CREATE TABLE ""station_combos"" ( ""start_name"" TEXT,  ""end_name"" TEXT,  ""distance"" REAL )"
table,trips,trips,45773,"CREATE TABLE ""trips"" ( ""trip_id"" INTEGER,  ""duration"" INTEGER,  ""start_time"" TIMESTAMP,  ""end_time"" TIMESTAMP,  ""start_station"" INTEGER,  ""end_station"" INTEGER,  ""bike_id"" TEXT,  ""plan_duration"" REAL,  ""trip_route_category"" TEXT,  ""passholder_type"" TEXT,  ""bike_type"" TEXT )"
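
The `sql` column above packs each schema into one long string. As a minimal sketch, the same information can be read column by column with `PRAGMA table_info` from Python's standard-library `sqlite3` module. The helper name `describe_tables` is hypothetical, and the in-memory table is a cut-down stand-in for `stations`; to run it against the real data, one would connect to the `bike_trip_db` file instead (assuming it sits in the working directory).

```python
import sqlite3

# In-memory stand-in for bike_trip_db; the schema is a cut-down copy of the
# "stations" table shown above and is used for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "stations" ("station_id" INTEGER, "station_name" TEXT, "status" TEXT)'
)


def describe_tables(conn):
    """Return {table_name: [(column_name, declared_type), ...]} (hypothetical helper)."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        # Table names cannot be bound as parameters, so the name is quoted inline.
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in info]
    return schema


schema = describe_tables(conn)
for table, columns in schema.items():
    print(table, columns)
```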


<br>Print the first few rows from each table

In [7]:
%%sql
SELECT * 
FROM stations
LIMIT 5

 * sqlite:///bike_trip_db
Done.


station_id,station_name,day_of_go_live_date,status,neighborhood,station_lat,station_lon
3000,Virtual Station,2015-04-23 00:00:00,Active,Center City,,
3004,Municipal Services Building Plaza,2015-04-23 00:00:00,Active,Center City,39.953781,-75.163742
3005,"Welcome Park, NPS",2015-04-23 00:00:00,Active,Center City East,39.94733,-75.144028
3006,40th & Spruce,2015-04-23 00:00:00,Active,University City,39.952202,-75.20311
3007,"11th & Pine, Kahn Park",2015-04-23 00:00:00,Active,Washington Square West,39.945171,-75.159927


In [8]:
%%sql
SELECT COUNT(*)
FROM stations

 * sqlite:///bike_trip_db
Done.


COUNT(*)
180


In [9]:
%%sql
SELECT * 
FROM trips
LIMIT 5

 * sqlite:///bike_trip_db
Done.


trip_id,duration,start_time,end_time,start_station,end_station,bike_id,plan_duration,trip_route_category,passholder_type,bike_type
306773863,8,2019-01-01 00:19:00,2019-01-01 00:27:00,3049,3007,14495,30.0,One Way,Indego30,standard
306773862,7,2019-01-01 00:30:00,2019-01-01 00:37:00,3005,3007,5332,1.0,One Way,Day Pass,standard
306773861,13,2019-01-01 00:52:00,2019-01-01 01:05:00,3166,3169,14623,30.0,One Way,Indego30,standard
306773860,9,2019-01-01 00:55:00,2019-01-01 01:04:00,3058,3103,11706,30.0,One Way,Indego30,standard
306773859,12,2019-01-01 01:05:00,2019-01-01 01:17:00,3182,3028,11039,30.0,One Way,Indego30,standard


In [10]:
%%sql
SELECT COUNT(*)
FROM trips

 * sqlite:///bike_trip_db
Done.


COUNT(*)
1989934


In [12]:
%%sql
SELECT *
FROM station_combos
LIMIT 5

 * sqlite:///bike_trip_db
Done.


start_name,end_name,distance
10th & Chestnut,10th & Federal,1.1
10th & Chestnut,11th & Market,0.4
10th & Chestnut,"11th & Pine, Kahn Park",0.6
10th & Chestnut,"11th & Poplar, John F. Street Community Center",1.6
10th & Chestnut,11th & Reed,1.3


## Analyze station data
[Return to top](#Part-2:-Indego-Data-Analysis-with-SQL)

<br>List each neighborhood with its number of Indego stations, largest first

In [30]:
%%sql
SELECT neighborhood, COUNT(*) count
FROM stations
GROUP BY neighborhood
ORDER BY count DESC;

 * sqlite:///bike_trip_db
Done.


neighborhood,count
North Philadelphia,35
University City,25
Center City,21
Center City West,12
Center City East,11
West Philadelphia,9
Rittenhouse Square,6
Point Breeze,6
Graduate Hospital,6
Washington Square West,4


<br>Count the stations in Center City, including those in Center City East and Center City West

In [20]:
%%sql
SELECT COUNT(*) count
FROM stations
WHERE neighborhood LIKE 'Center City%'

 * sqlite:///bike_trip_db
Done.


count
44
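
This total agrees with the per-neighborhood counts listed earlier: 21 + 12 + 11 = 44. The sketch below illustrates how the `LIKE 'Center City%'` prefix filter behaves, using Python's standard-library `sqlite3` and a handful of hypothetical rows (not Indego data). It also shows one thing worth knowing about SQLite: `LIKE` is case-insensitive for ASCII characters by default, while `GROUP BY` still treats differently cased values as separate groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stations (station_id INTEGER, neighborhood TEXT)")

# Hypothetical rows for illustration; not taken from the Indego data.
conn.executemany(
    "INSERT INTO stations VALUES (?, ?)",
    [
        (1, "Center City"),
        (2, "Center City East"),
        (3, "Center City West"),
        (4, "University City"),
        (5, "center city"),  # lowercase on purpose
    ],
)

# '%' matches any run of characters (including none), so this is a prefix match.
prefix_count = conn.execute(
    "SELECT COUNT(*) FROM stations WHERE neighborhood LIKE 'Center City%'"
).fetchone()[0]

# Per-neighborhood breakdown of the same filter, useful as a cross-check.
breakdown = conn.execute(
    """
    SELECT neighborhood, COUNT(*) AS n
    FROM stations
    WHERE neighborhood LIKE 'Center City%'
    GROUP BY neighborhood
    ORDER BY neighborhood
    """
).fetchall()

print(prefix_count)
print(breakdown)
```

Summing the breakdown and comparing it with the single count is a cheap way to confirm that the pattern picks up exactly the intended neighborhoods.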


<br>Are any stations inactive?

In [23]:
%%sql
SELECT *
FROM stations
WHERE status = 'Inactive'

 * sqlite:///bike_trip_db
Done.


station_id,station_name,day_of_go_live_date,status,neighborhood,station_lat,station_lon
3023,Rittenhouse Square,2015-04-23 00:00:00,Inactive,Rittenhouse Square,,
3027,"40th Street Station, MFL",2015-04-23 00:00:00,Inactive,University City,39.95694,-75.200691
3036,2nd & Germantown,2015-04-23 00:00:00,Inactive,North Philadelphia,39.968441,-75.140007
3038,The Children's Hospital of Philadelphia (CHOP),2015-04-23 00:00:00,Inactive,University City,39.947811,-75.194092
3048,Broad & Fitzwater,2015-04-23 00:00:00,Inactive,South Philadelphia,,
3095,29th & Diamond,2016-04-28 00:00:00,Inactive,North Philadelphia,39.987709,-75.180519
3103,"27th & Master, Athletic Recreation Center",2016-05-03 00:00:00,Inactive,North Philadelphia,39.977139,-75.179398
3105,Penn Treaty Park,2016-05-03 00:00:00,Inactive,North Philadelphia,39.96207,-75.141113
3109,Parkside & Girard,2016-05-06 00:00:00,Inactive,East Parkside,,
3122,"24th & Cecil B. Moore, Cecil B. Moore Library",2016-04-27 00:00:00,Inactive,North Philadelphia,,
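
Several of the rows above have empty `station_lat` and `station_lon` values. A minimal sketch of how one might summarize stations by status and count missing coordinates in the same pass, using Python's standard-library `sqlite3` with hypothetical rows. Note that NULLs have to be tested with `IS NULL`; a comparison such as `station_lat = NULL` never evaluates to true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stations ("
    "station_id INTEGER, status TEXT, station_lat REAL, station_lon REAL)"
)

# Hypothetical rows for illustration; None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO stations VALUES (?, ?, ?, ?)",
    [
        (1, "Active", 39.95, -75.16),
        (2, "Inactive", None, None),
        (3, "Inactive", 39.96, -75.20),
        (4, "Active", None, None),
    ],
)

# In SQLite a boolean expression evaluates to 0 or 1, so SUM() over it
# counts the rows for which the expression is true.
status_summary = conn.execute(
    """
    SELECT status,
           COUNT(*) AS stations,
           SUM(station_lat IS NULL OR station_lon IS NULL) AS missing_coords
    FROM stations
    GROUP BY status
    ORDER BY status
    """
).fetchall()

print(status_summary)
```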


<br>Find the top ten neighborhoods by active stations. Treat all Center City stations as a single neighborhood

In [31]:
%%sql
WITH stations_updated_neighborhoods AS (
    SELECT *, 
    CASE 
        WHEN neighborhood LIKE 'Center City%' THEN 'Center City'
        ELSE neighborhood
    END updated_neighborhood
    FROM stations
)
SELECT 
    updated_neighborhood, 
    COUNT(*) count
FROM stations_updated_neighborhoods
WHERE status = 'Active'
GROUP BY updated_neighborhood
ORDER BY count DESC
LIMIT 10;

 * sqlite:///bike_trip_db
Done.


updated_neighborhood,count
Center City,42
North Philadelphia,30
University City,22
West Philadelphia,9
Point Breeze,6
Rittenhouse Square,5
Graduate Hospital,5
Washington Square West,4
South Philadelphia East,4
West Poplar,3


## Join trip data with distance data
[Return to top](#Part-2:-Indego-Data-Analysis-with-SQL)

<br>Create a view that adds the station-to-station distance to each trip, then preview the first ten rows

In [55]:
%%sql
CREATE VIEW IF NOT EXISTS trips_with_distance AS

-- Attach start and end station IDs to each station pair by matching on station name
WITH station_combos_with_ids AS (
    SELECT sc.*, s1.station_id start_id, s2.station_id end_id
    FROM station_combos sc
    LEFT JOIN stations s1
    ON sc.start_name = s1.station_name
    LEFT JOIN stations s2
    ON sc.end_name = s2.station_name
)

-- Keep every trip; distance is NULL when no matching station pair exists
SELECT t.*, scwi.distance distance
FROM trips t
LEFT JOIN station_combos_with_ids scwi
ON t.start_station = scwi.start_id AND t.end_station = scwi.end_id;

SELECT * FROM trips_with_distance
LIMIT 10

 * sqlite:///bike_trip_db
Done.
Done.


trip_id,duration,start_time,end_time,start_station,end_station,bike_id,plan_duration,trip_route_category,passholder_type,bike_type,distance
306773863,8,2019-01-01 00:19:00,2019-01-01 00:27:00,3049,3007,14495,30.0,One Way,Indego30,standard,1.3
306773862,7,2019-01-01 00:30:00,2019-01-01 00:37:00,3005,3007,5332,1.0,One Way,Day Pass,standard,1.2
306773861,13,2019-01-01 00:52:00,2019-01-01 01:05:00,3166,3169,14623,30.0,One Way,Indego30,standard,1.6
306773860,9,2019-01-01 00:55:00,2019-01-01 01:04:00,3058,3103,11706,30.0,One Way,Indego30,standard,1.3
306773859,12,2019-01-01 01:05:00,2019-01-01 01:17:00,3182,3028,11039,30.0,One Way,Indego30,standard,1.9
306773858,13,2019-01-01 01:07:00,2019-01-01 01:20:00,3046,3075,3628,30.0,One Way,Indego30,standard,1.9
306773857,10,2019-01-01 01:13:00,2019-01-01 01:23:00,3045,3155,2474,30.0,One Way,Indego30,standard,1.0
306773855,5,2019-01-01 01:14:00,2019-01-01 01:19:00,3115,3123,11913,30.0,One Way,Indego30,standard,0.7
306773856,18,2019-01-01 01:14:00,2019-01-01 01:32:00,3068,3007,5224,30.0,One Way,Indego30,standard,1.0
306773854,6,2019-01-01 01:26:00,2019-01-01 01:32:00,3022,3165,2498,30.0,One Way,Indego30,standard,0.7
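
Because the view is built from `LEFT JOIN`s, two sanity checks are worth running: the view should have exactly as many rows as `trips` (no fan-out from duplicate station names or station pairs), and trips without a matching pair will carry a NULL `distance`. The sketch below reproduces the view on a tiny in-memory database with hypothetical rows and runs both checks using Python's standard-library `sqlite3`; the same three `COUNT(*)` queries could be run against `bike_trip_db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stations (station_id INTEGER, station_name TEXT);
    CREATE TABLE station_combos (start_name TEXT, end_name TEXT, distance REAL);
    CREATE TABLE trips (trip_id INTEGER, start_station INTEGER, end_station INTEGER);

    -- Hypothetical rows for illustration; not Indego data.
    INSERT INTO stations VALUES (1, 'A St'), (2, 'B St'), (3, 'C St');
    INSERT INTO station_combos VALUES ('A St', 'B St', 0.5), ('B St', 'A St', 0.5);
    INSERT INTO trips VALUES (100, 1, 2), (101, 2, 1), (102, 1, 3), (103, 1, 1);

    CREATE VIEW trips_with_distance AS
    WITH station_combos_with_ids AS (
        SELECT sc.*, s1.station_id AS start_id, s2.station_id AS end_id
        FROM station_combos sc
        LEFT JOIN stations s1 ON sc.start_name = s1.station_name
        LEFT JOIN stations s2 ON sc.end_name = s2.station_name
    )
    SELECT t.*, scwi.distance AS distance
    FROM trips t
    LEFT JOIN station_combos_with_ids scwi
        ON t.start_station = scwi.start_id AND t.end_station = scwi.end_id;
    """
)

# Check 1: the join must not duplicate trips.
trip_rows = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
view_rows = conn.execute("SELECT COUNT(*) FROM trips_with_distance").fetchone()[0]

# Check 2: trips with no matching station pair end up with a NULL distance.
unmatched = conn.execute(
    "SELECT COUNT(*) FROM trips_with_distance WHERE distance IS NULL"
).fetchone()[0]

print(trip_rows, view_rows, unmatched)
```

In this toy data, trips 102 and 103 have no entry in `station_combos` (one pair is missing, the other is a round trip to the same station), so they are the two unmatched rows.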
