# Uber pickups — quick exploratory analysis

This notebook loads a public Uber pickups CSV (April 2014) and runs a short EDA: first rows and shape, pickups by hour, busiest dates, and a simple scatter plot of pickup locations. It is designed to run as-is in Colab or local Jupyter.

In [None]:
# Install any missing packages (uncomment if needed)
# !pip install -q pandas matplotlib seaborn


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style='whitegrid')  # set_theme is the current name for sns.set

# Public CSV (April 2014) from FiveThirtyEight's uber-tlc-foil-response repo
url = 'https://raw.githubusercontent.com/fivethirtyeight/uber-tlc-foil-response/master/uber-trip-data/uber-raw-data-apr14.csv'

# Load
df = pd.read_csv(url)
df.columns = [c.strip() for c in df.columns]
display(df.head())
print('\nRows, cols:', df.shape)

# Parse datetime column (column is named 'Date/Time' in that CSV)
if 'Date/Time' in df.columns:
    df['Date/Time'] = pd.to_datetime(df['Date/Time'])
    df['hour'] = df['Date/Time'].dt.hour
    df['date'] = df['Date/Time'].dt.date
else:
    print('Expected column "Date/Time" not found; available columns:', df.columns.tolist())


In [None]:
# Quick aggregated plots
plt.figure(figsize=(10,4))
if 'hour' in df.columns:
    hourly = df.groupby('hour').size()
    # Single color: passing palette without hue is deprecated in recent seaborn
    sns.barplot(x=hourly.index, y=hourly.values, color='steelblue')
    plt.xlabel('Hour of day')
    plt.ylabel('Number of pickups')
    plt.title('Uber pickups by hour (Apr 2014)')
else:
    plt.text(0.5, 0.5, 'No hour column', ha='center')
plt.show()

# Top 10 dates by pickup count
if 'date' in df.columns:
    top_dates = df['date'].value_counts().head(10)
    print('\nTop pickup dates (count):')
    print(top_dates)


In [None]:
# Simple scatter of lat/lon; points are dense, so use small markers and low alpha
if {'Lat','Lon'}.issubset(set(df.columns)):
    plt.figure(figsize=(6,6))
    plt.scatter(df['Lon'], df['Lat'], s=1, alpha=0.3)
    plt.title('Pickup locations (Lon/Lat)')
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')
    plt.show()
else:
    print('No Lat/Lon columns available for map scatter.')


## Next steps / reproducibility

1. To add this notebook to your public repo `DataEngineering`, commit it and push it with git.
2. If the original ShivramSriramulu/Uber-analysis repo contains different or additional data, add those CSVs to the repo and change `url` in the data-loading cell to point at them (a URL or a local path both work with `pd.read_csv`).
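
One way to handle step 2 without editing the URL by hand is sketched below. It is only an illustration: the helper `resolve_sources` and the folder name `data` are hypothetical and not part of the notebook. The helper returns any CSVs found in a local folder and otherwise falls back to the April 2014 URL used in the data-loading cell.

```python
from pathlib import Path

# Default source: the April 2014 CSV used in the data-loading cell.
DEFAULT_URL = (
    "https://raw.githubusercontent.com/fivethirtyeight/"
    "uber-tlc-foil-response/master/uber-trip-data/uber-raw-data-apr14.csv"
)


def resolve_sources(data_dir="data", default_url=DEFAULT_URL):
    """Return local CSV paths from data_dir if any exist, else [default_url].

    `data_dir` is a hypothetical folder name; point it at wherever the
    CSVs live in your repo.
    """
    folder = Path(data_dir)
    if folder.is_dir():
        # Sort so the load order is stable across runs.
        local = sorted(str(p) for p in folder.glob("*.csv"))
        if local:
            return local
    return [default_url]


sources = resolve_sources()
print(sources)
```

In the data-loading cell, `pd.read_csv(url)` could then become `pd.concat((pd.read_csv(s) for s in sources), ignore_index=True)`, assuming every CSV shares the same columns.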

### Quick commands to run the notebook locally or in Colab:
- Colab: upload this notebook (File -> Upload notebook) and run the cells.
- Local: start Jupyter with `jupyter notebook` and open the notebook.
