# Welcome to OSMNxMapping ☀️!

In this Jupyter notebook, we'll explore advanced pipelining techniques using the `UrbanPipeline` class in OSMNxMapping. We'll work with taxi trip data from New York City, enriching a Manhattan street network with two aggregations: the trip count and the average tip amount per street segment. This example demonstrates how to perform multiple singular enrichments in a single pipeline.

**Goals**:
- Import the OSMNxMapping library and necessary modules.
- Initialise an OSMNxMapping instance.
- Build a pipeline that includes loading, preprocessing, multiple singular enrichments, and visualisation.
- Execute the pipeline using `compose_transform`.
- Visualise the enriched network with multiple attributes.
- Understand how to handle multiple enrichments in a single workflow.

Unlike previous notebooks, we won't use Auctus here, so the data must be available locally in CSV, Shapefile, or Parquet format. We'll use a sample CSV file (`taxis.csv`). For foundational steps or alternative approaches, refer to the `1-OSMNX_MAPPING_with_Auctus_basics` notebook.

Let’s get started! 🚀

## Step 1: Import the Library and Modules

We start by importing the `osmnx_mapping` library and the necessary modules for building our pipeline. This includes classes for the network, data loader, preprocessors, enrichers, and visualisers.


In [1]:
import osmnx_mapping as oxm
from osmnx_mapping.pipeline import UrbanPipeline
from osmnx_mapping.modules.network import OSMNxNetwork
from osmnx_mapping.modules.loader import CSVLoader
from osmnx_mapping.modules.preprocessing import CreatePreprocessor
from osmnx_mapping.modules.enricher import CreateEnricher
from osmnx_mapping.modules.visualiser import InteractiveVisualiser


## Step 2: Initialise an OSMNxMapping Instance

We create an instance of `OSMNxMapping` named `taxi_trips`. This instance will manage our pipeline and urban data analysis.


In [2]:
taxi_trips = oxm.OSMNxMapping()

## Step 3: Build the Urban Pipeline

We construct an `UrbanPipeline` with a series of steps to process our taxi trip data and enrich the Manhattan street network:

- **Network**: Query a drive network for Manhattan, NYC using `OSMNxNetwork`.
- **Loader**: Load the taxi trip data from a CSV file using `CSVLoader`.
- **Impute**: Handle missing geospatial data using a `SimpleGeoImputer` built with `CreatePreprocessor`.
- **Filter**: Retain only data points within the network's bounding box using a `BoundingBoxFilter` built with `CreatePreprocessor`.
- **Enrich avg_trip_count**: Count the trips per node and average those counts onto each street segment.
- **Enrich avg_tip_amount**: Calculate the average tip amount per street segment.
- **Visualise**: Set up an `InteractiveVisualiser` for interactive visualisation.

> **Note**: Ensure the file path to the CSV file is correct and that the column names match your dataset.
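
One way to act on the note above is to check the CSV header before building the pipeline. The sketch below uses only the standard library and an in-memory sample; the sample rows and the helper `missing_columns` are hypothetical, and the required column names are simply the ones this notebook refers to.

```python
import csv
import io

# Hypothetical sample standing in for the first rows of taxis.csv;
# the column names are the ones used elsewhere in this notebook.
SAMPLE_CSV = """pickup_latitude,pickup_longitude,tip_amount
40.7580,-73.9855,2.50
40.7484,-73.9857,0.00
"""

REQUIRED_COLUMNS = {"pickup_latitude", "pickup_longitude", "tip_amount"}


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = set(reader.fieldnames or [])
    return sorted(required - header)


# With a real file, pass open("./taxis.csv", newline="") instead of io.StringIO(...).
print(missing_columns(io.StringIO(SAMPLE_CSV), REQUIRED_COLUMNS))
```

An empty list means every expected column is present; any names printed would need to be renamed in the dataset or changed in the pipeline arguments.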


In [None]:
pipeline_csv = UrbanPipeline([
    ("network", OSMNxNetwork(place_name="Manhattan, NYC", network_type="drive")),
    ("loader", CSVLoader(file_path="./taxis.csv")),
    ("impute", CreatePreprocessor().with_imputer(
        imputer_type="SimpleGeoImputer",
    ).build()),
    ("filter", CreatePreprocessor().with_filter(
        filter_type="BoundingBoxFilter"
    ).build()),
    ("enrich avg_trip_count", CreateEnricher()
        .with_data(group_by="nearest_node")
        .count_by(edge_method="average", output_column="avg_trip_count")
        .build()
    ),
    ("enrich avg_tip_amount", CreateEnricher()
        .with_data(group_by="nearest_node", values_from="tip_amount")
        .aggregate_with(method="mean", output_column="avg_tip_amount")
        .build()
    ),
    ("viz", InteractiveVisualiser())
])
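
To picture what the two preprocessing steps do conceptually, here is a plain-Python sketch. It is not the library's implementation: it assumes the imputer simply drops rows with a missing coordinate and that the filter keeps rows inside a rectangular bounding box. The trip records, the bounding-box values, and the helpers `has_coordinates` and `in_bbox` are all hypothetical.

```python
import math

# Hypothetical trip records; None or NaN marks a missing coordinate.
trips = [
    {"pickup_latitude": 40.7580, "pickup_longitude": -73.9855, "tip_amount": 2.5},
    {"pickup_latitude": None, "pickup_longitude": -73.9857, "tip_amount": 1.0},
    {"pickup_latitude": 40.6413, "pickup_longitude": -73.7781, "tip_amount": 8.0},
    {"pickup_latitude": 40.7484, "pickup_longitude": float("nan"), "tip_amount": 0.0},
]

# Rough, illustrative box around Manhattan: (min_lon, min_lat, max_lon, max_lat).
MANHATTAN_BBOX = (-74.03, 40.68, -73.90, 40.88)


def has_coordinates(row):
    """True when both coordinates are present and are not NaN."""
    values = (row["pickup_latitude"], row["pickup_longitude"])
    return all(v is not None and not math.isnan(v) for v in values)


def in_bbox(row, bbox):
    """True when the pickup point lies inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return (min_lat <= row["pickup_latitude"] <= max_lat
            and min_lon <= row["pickup_longitude"] <= max_lon)


# Step 1 (assumed imputer behaviour): drop rows with missing coordinates.
imputed = [row for row in trips if has_coordinates(row)]
# Step 2 (assumed filter behaviour): keep rows inside the bounding box.
filtered = [row for row in imputed if in_bbox(row, MANHATTAN_BBOX)]

print(len(trips), len(imputed), len(filtered))
```

The order matters in this sketch: the bounding-box comparison would fail on `None`, so rows with missing coordinates are removed first, mirroring the impute-then-filter order of the pipeline.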


### Understanding the Enrichment Steps

In this pipeline, we perform two distinct enrichments on the network:

1. **Enrich avg_trip_count**:
   - We use `CreateEnricher().with_data(group_by="nearest_node")` to group the data by the nearest node in the network.
   - Then, `.count_by(edge_method="average", output_column="avg_trip_count")` counts the trips associated with each node. With `edge_method="average"`, each edge then receives the average of the counts at the nodes it connects, which gives a measure of trip density per street segment.

2. **Enrich avg_tip_amount**:
   - Similarly, `CreateEnricher().with_data(group_by="nearest_node", values_from="tip_amount")` groups the data by the nearest node and considers the `tip_amount` values.
   - `.aggregate_with(method="mean", output_column="avg_tip_amount")` calculates the mean tip amount for trips associated with each node. Because no `edge_method` is specified here, the library's default is used to map node values onto edges (e.g., averaging between connected nodes).

These enrichments allow us to analyse different aspects of the taxi trip data on the street network simultaneously.
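
The two enrichments can be sketched in plain Python as follows. This is an illustration of the idea rather than the library's code: it assumes trips are already tagged with their nearest node, that edge values are the average of the two end-node values, and that a node with no trips contributes 0. The data and the helper `edge_average` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trips already tagged with their nearest network node.
trips = [
    {"nearest_node": "A", "tip_amount": 2.0},
    {"nearest_node": "A", "tip_amount": 4.0},
    {"nearest_node": "B", "tip_amount": 1.0},
    {"nearest_node": "C", "tip_amount": 3.0},
]
# Hypothetical edges, each given as a pair of end nodes.
edges = [("A", "B"), ("B", "C"), ("C", "D")]

# Group tip amounts by nearest node.
tips_by_node = defaultdict(list)
for trip in trips:
    tips_by_node[trip["nearest_node"]].append(trip["tip_amount"])

# Node-level aggregations: a count and a mean.
trip_count = {node: len(tips) for node, tips in tips_by_node.items()}
mean_tip = {node: mean(tips) for node, tips in tips_by_node.items()}


def edge_average(node_values, edges, default=0.0):
    """Give each edge the average of the values at its two end nodes."""
    return {
        (u, v): (node_values.get(u, default) + node_values.get(v, default)) / 2
        for u, v in edges
    }


avg_trip_count = edge_average(trip_count, edges)
avg_tip_amount = edge_average(mean_tip, edges)
print(avg_trip_count)
print(avg_tip_amount)
```

Both enrichments share the same grouping and differ only in the node-level aggregation, which is why they fit naturally as two separate steps of one pipeline.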


## Step 4: Execute the Pipeline

We execute the pipeline using `compose_transform`, which runs all the defined steps in sequence. We specify the latitude and longitude column names from the dataset to ensure correct geospatial processing.

This step returns four objects, in order: the processed data (`data_csv`), the enriched graph (`graph_csv`), and its nodes (`nodes_csv`) and edges (`edges_csv`).


In [4]:
data_csv, graph_csv, nodes_csv, edges_csv = pipeline_csv.compose_transform(
    latitude_column_name="pickup_latitude",
    longitude_column_name="pickup_longitude"
)
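
Running named steps in sequence can be pictured with a toy runner like the one below. It is only a sketch of the general pattern and says nothing about how `UrbanPipeline` is implemented; the function `run_steps`, the step lambdas, and the sample rows are hypothetical.

```python
def run_steps(steps, data):
    """Apply each named step to the output of the previous one, in order."""
    for name, step in steps:
        data = step(data)
        print(f"{name}: {len(data)} rows")
    return data


# Hypothetical steps: each one takes a list of rows and returns a new list.
steps = [
    ("impute", lambda rows: [r for r in rows if r["lat"] is not None]),
    ("filter", lambda rows: [r for r in rows if 40.68 <= r["lat"] <= 40.88]),
    ("enrich", lambda rows: [{**r, "trip_count": 1} for r in rows]),
]

rows = [{"lat": 40.75}, {"lat": None}, {"lat": 41.20}]
result = run_steps(steps, rows)
print(result)
```

Naming each step, as the pipeline above does with `"impute"`, `"filter"`, and so on, makes it easy to see which stage changed the data when something looks wrong.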


## Step 5: Visualise the Enriched Network

Finally, we visualise the enriched network using the `visualise` method of the pipeline. We specify the columns to visualise (`avg_trip_count` and `avg_tip_amount`) and choose a colormap for the visualisation.

The `InteractiveVisualiser` will display an interactive map, allowing you to explore both enriched attributes.

> **Note**: Ensure that the Jupyter extensions for interactive visualisation are installed as per the library's installation instructions.


In [7]:
viz_csv = pipeline_csv.visualise(result_columns=["avg_trip_count", "avg_tip_amount"], colormap="Greens")
viz_csv



## Conclusion

Congratulations! 🎉 You've successfully built an advanced urban pipeline using OSMNxMapping to enrich a Manhattan street network with multiple attributes from taxi trip data. This example demonstrates the power of pipelining for handling complex urban data workflows efficiently.

For more details on each component or to explore other features, refer to the OSMNxMapping API documentation and the `examples/` directory.

Happy urban mapping! 🌆
