# **Customer Orders Analysis Script** #

## **Overview** ##

This Python script reads order data from an input CSV file in batches and analyzes it: it aggregates spending per customer, calculates the cost per mile of each delivery, and summarizes how customers are distributed across sources and destinations.

## **Dependencies** ##
To run this script, the following Python libraries are required:

* `pandas`
* `geopy`

Install these dependencies by running the following commands in a terminal:

* `python -m pip install pandas`
* `python -m pip install geopy`

## **Description** ##

The script performs the following tasks:
* Aggregates customer spending: Sums up the payment amounts for each customer.
* Counts occurrences: Counts the number of occurrences of each source and destination to understand customer distribution.
* Retrieves coordinates: Uses the Nominatim API to get latitude and longitude for source and destination locations.
* Calculates distances: Computes distances between source and destination coordinates using the geodesic function from the geopy library.
* Calculates cost per mile: Determines the cost per mile for each delivery.

## **Usage** ##

**1. Input CSV File**
   The script expects an input CSV file named `orders.csv` with the following columns:
* `order_id`: Unique identifier for each order.
* `source`: The source location.
* `destination`: The destination location.
* `customer_id`: Unique identifier for each customer.
* `customer_payment_amount`: The amount spent by the customer.
* `item_weight`: The weight of the item being shipped.
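
As a rough sketch of how a file with these columns might be processed in batches, the example below reads a CSV in chunks with `pandas.read_csv(..., chunksize=...)`, sums payments per customer, and counts sources and destinations. The function name `analyze_orders` and the sample rows are hypothetical and only illustrate the approach; the script's own implementation may differ.

```python
import io
from collections import Counter, defaultdict

import pandas as pd

# Hypothetical sample data using the column names listed above.
SAMPLE_CSV = """order_id,source,destination,customer_id,customer_payment_amount,item_weight
1,Boston,Chicago,C1,120.0,10
2,Boston,Denver,C2,80.0,5
3,Chicago,Denver,C1,60.0,8
4,Boston,Chicago,C3,40.0,2
"""


def analyze_orders(csv_file, chunksize=100):
    """Read orders in batches and accumulate spending and location counts."""
    spending = defaultdict(float)
    source_counts = Counter()
    destination_counts = Counter()
    for chunk in pd.read_csv(csv_file, chunksize=chunksize):
        # Sum payments per customer within this batch, then add to running totals.
        batch_totals = chunk.groupby("customer_id")["customer_payment_amount"].sum()
        for customer_id, amount in batch_totals.items():
            spending[customer_id] += float(amount)
        # Count how often each location appears as a source or destination.
        source_counts.update(chunk["source"])
        destination_counts.update(chunk["destination"])
    return dict(spending), source_counts, destination_counts


# A small chunksize forces several batches even for this tiny sample.
spending, source_counts, destination_counts = analyze_orders(
    io.StringIO(SAMPLE_CSV), chunksize=2
)
print(spending)
print(source_counts.most_common())
print(destination_counts.most_common())
```

Keeping running totals across chunks means only one batch is held in memory at a time, which is the usual reason for reading a large file with `chunksize`.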

**2. Script Execution**
   Run the script from your terminal, replacing `script_location.py` with the path to the script:
* `python script_location.py`

## **Examples for Usage** ##

**1. Input CSV Files**
![image.png](attachment:f209c83b-3a9d-4455-877e-d7860aa77b11.png)

**2. Script Execution**
![image.png](attachment:0d9df98c-2253-46e5-8e27-ba594977bef8.png)

**3. Output**
![image.png](attachment:cc0ab76f-c0d0-4913-8355-3a6517a55617.png)

## **Parameters** ##

* `chunksize`: The number of rows to read from the CSV file per batch (default: 100).
* `chunk`: A chunk of the orders DataFrame.
* `lists`: List of elements to count.
* `locations`: List of location names.
* `source_coords`: List of tuples containing source coordinates (latitude, longitude).
* `destination_coords`: List of tuples containing destination coordinates (latitude, longitude).
* `payment_amounts`: List of customer payment amounts.
* `weights`: List of item weights.
* `distances`