# Scooter Exploratory Data

**Questions to Answer:**

* Are there any null values in any columns in either table?
    * Yes. The chargelevel column in the scooters table has 770 nulls.

*  What date range is represented in each of the date columns? Investigate any values that seem odd.
    * May 1, 2019 to July 31, 2019 (based on startdate in the trips table)

* Is time represented with am/pm or using 24 hour values in each of the columns that include time?
    * 24-hour values

* What values are there in the sumdgroup column? Are there any that are not of interest for this project?
    * bicycle, scooter, and Scooter. The two scooter values differ only in capitalization. Bicycles are not needed for this project, I assume.

* What are the minimum and maximum values for all the latitude and longitude columns? Do these ranges make sense, or is there anything surprising? 
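    * The minimum latitude and the maximum longitude are both 0.0, which would put those points on the equator and the prime meridian, respectively. The maximum latitude is 3609874.116666, far outside the valid range of -90 to 90. This makes me wonder whether some positions are measured from an arbitrary reference point rather than as geographic coordinates.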

* What is the range of values for trip duration and trip distance? Do these values make sense? Explore values that might seem questionable.

* Check out how the values for the company name column in the scooters table compare to those of the trips table. What do you notice?

In [1]:
from sqlalchemy import create_engine

In [2]:
import pandas as pd

In [3]:
database_name = 'scooters'    # Fill this in with your scooter database name

connection_string = f"postgresql://postgres:postgres@localhost:5432/{database_name}"

In [4]:
engine = create_engine(connection_string)

In [6]:
query = '''
SELECT DISTINCT(companyname)
FROM scooters;
'''
result = engine.execute(query)

In [7]:
result.fetchall()

[('Bird',), ('Bolt',), ('Gotcha',), ('Jump',), ('Lime',), ('Lyft',), ('Spin',)]

## Companies Involved Are:
* Bird
* Bolt
* Gotcha
* Jump
* Lime
* Lyft
* Spin

In [8]:
query = '''
SELECT MAX(startdate), MIN(startdate)
FROM trips
'''
result = engine.execute(query)

In [9]:
result.fetchall()

[(datetime.date(2019, 7, 31), datetime.date(2019, 5, 1))]

## Dates pulled are from May 1, 2019 to July 31, 2019

In [10]:
query = '''
SELECT starttime
FROM trips
ORDER BY starttime DESC
LIMIT 5;
'''
result = engine.execute(query)

In [11]:
result.fetchall()

[(datetime.time(23, 59, 59, 506666),),
 (datetime.time(23, 59, 59, 286666),),
 (datetime.time(23, 59, 59, 56666),),
 (datetime.time(23, 59, 59),),
 (datetime.time(23, 59, 59),)]
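The values come back as `datetime.time` objects, which store the hour as 0 to 23 and carry no am/pm marker. The short sketch below uses one value copied from the output above to show the 24-hour reading and, for comparison, how the same value would look in 12-hour form. It is only an illustration of how to read the result, not a check on how the database stores the column.

```python
import datetime

# One value copied from the starttime query output above.
latest = datetime.time(23, 59, 59, 506666)

# The hour attribute is always on a 0-23 scale.
print(latest.hour)

# %H formats the hour on a 24-hour clock.
print(latest.strftime("%H:%M:%S"))

# %I and %p format the same value on a 12-hour clock with an am/pm marker.
print(latest.strftime("%I:%M:%S %p"))
```

An hour of 23 in the latest start times is what a 24-hour column would be expected to contain.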

## Nulls Found In:
* scooters
    * chargelevel - 770 nulls
* trips
    * pubdatetime - didn't run

In [12]:
query = '''
SELECT COUNT(*)
FROM scooters
WHERE chargelevel IS NULL;
'''

result = engine.execute(query)

result.fetchall()

[(770,)]
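Checking one column per query gets slow on a large table. A possible alternative, sketched below, is to count nulls for several columns in a single pass using `COUNT(*) - COUNT(column)`, since `COUNT(column)` skips nulls. The helper name `null_count_query` is hypothetical, and the column list is just the scooters columns that appear elsewhere in this notebook.

```python
def null_count_query(table, columns):
    """Build one SELECT that returns the number of NULLs in each listed column."""
    # COUNT(col) ignores NULLs, so COUNT(*) - COUNT(col) is the NULL count.
    # Names are interpolated directly, so only pass trusted table/column names.
    parts = [f"COUNT(*) - COUNT({col}) AS {col}_nulls" for col in columns]
    return f"SELECT {', '.join(parts)} FROM {table};"


# Columns taken from the queries in this notebook.
query = null_count_query("scooters", ["chargelevel", "latitude", "longitude"])
print(query)
```

The resulting string can be passed to the same `engine.execute(query)` call used in the other cells.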

## Types of Vehicles 

In [13]:
query = '''
SELECT DISTINCT(sumdgroup)
FROM scooters;
'''

result = engine.execute(query)

result.fetchall()

[('bicycle',), ('scooter',), ('Scooter',)]
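Because 'scooter' and 'Scooter' differ only in case, any later filter on this column needs to treat them as the same group. A minimal sketch of that normalization, using the three values returned above and assuming scooters are the only group of interest:

```python
# Values copied from the DISTINCT sumdgroup result above.
raw_groups = ["bicycle", "scooter", "Scooter"]

# Lowercasing collapses the two scooter spellings into a single value.
normalized = sorted({group.lower() for group in raw_groups})

# Keep only the group assumed to matter for this project.
of_interest = [group for group in normalized if group == "scooter"]

print(normalized)
print(of_interest)
```

The same idea in SQL would be a condition such as `WHERE LOWER(sumdgroup) = 'scooter'`.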

## Min & Max Latitude and Longitude

In [14]:
query = '''
SELECT MAX(latitude), MIN(latitude)
FROM scooters;
'''

result = engine.execute(query)

result.fetchall()

[(Decimal('3609874.116666'), Decimal('0.000000'))]

In [15]:
query = '''
SELECT MAX(longitude), MIN(longitude)
FROM scooters;
'''

result = engine.execute(query)

result.fetchall()

[(Decimal('0.000000'), Decimal('-97.443879'))]
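To judge whether these extremes make sense, they can be compared against the ranges geographic coordinates can take: latitude from -90 to 90 and longitude from -180 to 180. The sketch below applies that check to the four values returned above. The function name `is_valid_coordinate` is hypothetical.

```python
from decimal import Decimal


def is_valid_coordinate(lat, lng):
    """Return True when both values fall inside the valid geographic ranges."""
    return -90 <= lat <= 90 and -180 <= lng <= 180


# Extremes copied from the two query results above.
max_lat, min_lat = Decimal("3609874.116666"), Decimal("0.000000")
max_lng, min_lng = Decimal("0.000000"), Decimal("-97.443879")

# The maximum latitude is far outside [-90, 90], so this check fails.
print(is_valid_coordinate(max_lat, min_lng))

# 0.0 passes the range check, so zeros need a separate look.
print(is_valid_coordinate(min_lat, max_lng))
```

A range check catches the oversized latitude but not the zeros. A point at latitude 0, longitude 0 lies in the Atlantic Ocean, so zero values here are more likely placeholders than real scooter positions, though that is an assumption.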