# Python Tutorial: Consuming a REST API to Generate Mock Data for LCL Logistics

## Introduction to APIs

APIs (Application Programming Interfaces) are sets of protocols and tools for building software applications. They specify how software components should interact, allowing data to be exchanged between systems efficiently and securely. In this tutorial, we'll explore how to integrate a REST API using Python to generate mock data that could simulate various scenarios for logistics and transport operations at LCL Logistics.

## Practical Exercise: Consuming a REST API with Python

In this section, we'll use Python's `requests` library to consume an API that generates mock data, simulating payloads for logistics operations. This is useful for testing without touching real, sensitive company data. The same approach can simulate data such as shipment routes, temperatures, and delivery times, which are critical in the transport of perishables; the worked example below generates customer records with their perishable goods.

### 1. Importing Necessary Libraries
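We begin by importing the `requests` library, which lets us send HTTP requests, together with `os` and `load_dotenv` from `python-dotenv`, which we use to read the API key from a local `.env` file into the environment.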


In [6]:
import requests
import os
from dotenv import load_dotenv

load_dotenv()

True
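
`load_dotenv()` loads the variables from a `.env` file into the process environment, so later cells can look the key up in `os.environ`. As a minimal sketch, a small lookup helper can fail with a readable message when the key is missing. The helper name `require_env` is hypothetical; `OPEN_AI_KEY` is the variable name used later in this tutorial, and the value below is a dummy.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; add it to your .env file."
        )
    return value


# An explicit mapping is passed here so the sketch runs without a real key
print(require_env("OPEN_AI_KEY", env={"OPEN_AI_KEY": "example-key"}))
```

In a real session you would call `require_env("OPEN_AI_KEY")` without the `env` argument so that it reads from `os.environ`.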

### 2. Interpreting cURL Calls for Python Requests

In this section, we'll cover how to interpret and convert a cURL call into a Python `requests` call. cURL commands are often used for interacting with APIs directly from the command line. However, when building applications in Python, it's beneficial to translate these into Python code using the `requests` library.

#### Example cURL Call:
Here is the cURL call we will convert:

```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Who won the world series in 2020?"
      },
      {
        "role": "assistant",
        "content": "The Los Angeles Dodgers won the World Series in 2020."
      },
      {
        "role": "user",
        "content": "Where was it played?"
      }
    ]
  }'
```

### Step-by-Step Conversion to Python Requests

#### Step 1: Setting the URL
First, identify the URL from the cURL command where the request is being sent.

```python
url = 'https://api.openai.com/v1/chat/completions'
```

#### Step 2: Adding Headers
Extract the headers used in the cURL call and prepare them for the Python request.

```python
headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer YOUR_API_KEY'  # Replace YOUR_API_KEY with your actual OpenAI API key
}
```
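
Each `-H` flag in a cURL command carries one `Name: value` pair, which is why the headers map so naturally onto a dictionary. The following is only an illustrative sketch of that mapping; `curl_headers_to_dict` is a hypothetical helper, not part of `requests`, and the key is a placeholder.

```python
def curl_headers_to_dict(header_args):
    """Turn cURL -H values of the form 'Name: value' into a dict."""
    headers = {}
    for raw in header_args:
        # Split on the first colon only, since values may contain colons too
        name, _, value = raw.partition(":")
        headers[name.strip()] = value.strip()
    return headers


curl_h_values = [
    "Content-Type: application/json",
    "Authorization: Bearer YOUR_API_KEY",  # placeholder, not a real key
]
headers = curl_headers_to_dict(curl_h_values)
print(headers)
```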

#### Step 3: Preparing the Data Payload
The data payload (`-d`) in the cURL command is a JSON object, which maps directly onto a Python dictionary.

```python
data = {
    'model': 'gpt-3.5-turbo',
    'messages': [
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'user', 'content': 'Who won the world series in 2020?'},
        {'role': 'assistant', 'content': 'The Los Angeles Dodgers won the World Series in 2020.'},
        {'role': 'user', 'content': 'Where was it played?'}
    ]
}
```
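
If you would rather not retype the payload by hand, the standard `json` module can parse the text passed to `-d` directly. A minimal sketch, using a shortened copy of the payload above; the `flag` and `value` keys in the last line are made up purely to show how JSON literals map to Python ones.

```python
import json

# Shortened copy of the JSON text passed to curl's -d flag
curl_payload = """{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Where was it played?"}
  ]
}"""

data = json.loads(curl_payload)
print(data["messages"][-1]["content"])  # Where was it played?

# JSON literals become Python ones: false -> False, null -> None
print(json.loads('{"flag": false, "value": null}'))
```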

#### Step 4: Making the POST Request
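Use the `requests.post` method to send the request to the API. Passing the dictionary through the `json=` argument makes `requests` serialize it to a JSON body for us.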

```python
response = requests.post(url, headers=headers, json=data)
```

#### Step 5: Handling the Response
Finally, handle the response from the API. Check the status code to ensure it was successful and print the response data.

```python
if response.status_code == 200:
    print("Success! Response data:", response.json())
else:
    print("Failed to retrieve data, status code:", response.status_code)
```
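
A bare status code is not always easy to interpret when a request fails. As a small sketch, the standard library's `http.HTTPStatus` can turn the code into a readable phrase; `describe_status` is a hypothetical helper name introduced for illustration.

```python
from http import HTTPStatus


def describe_status(status_code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "Unknown Status"
    if 200 <= status_code < 300:
        kind = "success"
    elif 400 <= status_code < 500:
        kind = "client error"
    elif 500 <= status_code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{status_code} {phrase} ({kind})"


print(describe_status(200))  # 200 OK (success)
print(describe_status(401))  # 401 Unauthorized (client error)
```

In the `else` branch above, you could print `describe_status(response.status_code)` instead of the raw number.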



### 3. Setting Up the API Request
Next, we define the parameters for our API request. We will call OpenAI's `gpt-3.5-turbo` chat model through its REST API to generate mock logistics data.


In [7]:
# API URL
url = 'https://api.openai.com/v1/chat/completions'

# Headers with authorization; the key is read from the OPEN_AI_KEY environment variable loaded from .env
headers = {
    'Authorization': f'Bearer {os.environ["OPEN_AI_KEY"]}',
    'Content-Type': 'application/json',
}

# JSON payload specifying the model, prompt, and other parameters
data = {
    'model': 'gpt-3.5-turbo',
    'messages': [{'role': 'user', 'content': 'Generate a mock data for a logistic company handling perishable goods. The data should be customer information. Only return the JSON'}]
}


### 4. Sending the Request and Handling the Response
We will now send a POST request to the API with the headers and data we defined. We'll then handle the response to extract the generated mock data.
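
The generated text sits at `choices[0].message.content` in the response body. A minimal sketch of that extraction step as a reusable function is shown here; `extract_message` is a hypothetical helper, and `sample` is a hand-written stand-in for a response, not real API output. The sketch also returns `finish_reason`, which, to our understanding of the OpenAI API, is `"length"` when a reply stops at the token limit.

```python
def extract_message(payload):
    """Return (content, finish_reason) from a parsed chat completions response."""
    choices = payload.get("choices") or []
    if not choices:
        raise ValueError("Response contains no choices")
    first = choices[0]
    return first["message"]["content"], first.get("finish_reason")


# Hand-written stand-in for response.json(), for illustration only
sample = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "{}"},
            "finish_reason": "length",
        }
    ]
}

content, reason = extract_message(sample)
if reason == "length":
    print("Warning: the reply hit the token limit and may be cut off")
```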


In [8]:
response = requests.post(url, headers=headers, json=data)

# Checking if the request was successful
if response.status_code == 200:
    print("Data retrieved successfully!")
    # Extracting and printing the generated data
    mock_data = response.json()['choices'][0]['message']['content']
    print("Generated Mock Data:", mock_data)
else:
    print("Failed to retrieve data, status code:", response.status_code)


Data retrieved successfully!
Generated Mock Data: {
  "customers": [
    {
      "customer_id": 1,
      "name": "John Doe",
      "email": "johndoe@example.com",
      "phone": "123-456-7890",
      "address": "123 Main St, Anytown, USA",
      "order_id": "ORD1234",
      "delivery_date": "2022-05-15",
      "perishable_goods": [
        {
          "item_id": 1,
          "item_name": "Fresh Produce",
          "quantity": 50,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-20"
        },
        {
          "item_id": 2,
          "item_name": "Dairy Products",
          "quantity": 30,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-18"
        }
      ]
    },
    {
      "customer_id": 2,
      "name": "Jane Smith",
      "email": "janesmith@example.com",
      "phone": "234-567-8901",
      "address": "456 Oak St, Smallville, USA",
      "order_id": "ORD5678",
      "delivery_date": "2022-05-20",
      "perishable_goods": [
  

The first prompt leaves the choice of fields and structure up to the model. To get mock data in a predictable format, the prompt sent to the API must state exactly which fields are required and how they are nested.

### 5. Updated Prompt for Generating Specific Mock Data
The prompt below lists every required field explicitly, including the nested list of perishable goods:



In [10]:

# Define the API request payload with a detailed prompt
data = {
    'model': 'gpt-3.5-turbo',
    'messages': [
        {
            'role': 'user',
            'content': """
            Generate mock data for a logistic company handling perishable goods. 
            The data should include customer information formatted as JSON. 
            Each customer should have the following details:
            - Customer ID
            - Name
            - Email
            - Phone number
            - Address
            - Order ID
            - Delivery date
            - A list of perishable goods, with each item including:
              - Item ID
              - Item name
              - Quantity
              - Whether temperature control is required
              - Expiry date
            Return the data in JSON format as shown in the provided example.
            """
        }
    ]
}



This updated prompt is more specific: it names every required field and describes the nested structure. Note that it refers to a "provided example" without actually including one, so the model still chooses the exact key names. Pasting a short sample record into the prompt is one way to pin them down more tightly.
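
Because the model picks the key names itself (compare `phone` with `phone_number` in the two outputs in this tutorial), it can help to check each returned record before using it. A minimal sketch follows; the helper `missing_fields` and the set `REQUIRED_CUSTOMER_FIELDS` are hypothetical names, and the required set is an assumption based on the keys both outputs share.

```python
# Assumed set of keys we expect on every customer record
REQUIRED_CUSTOMER_FIELDS = {
    "customer_id",
    "name",
    "email",
    "address",
    "order_id",
    "delivery_date",
    "perishable_goods",
}


def missing_fields(record, required):
    """Return the sorted list of required keys that the record lacks."""
    return sorted(required - set(record))


# Hand-written partial record, for illustration only
record = {"customer_id": "C0001", "name": "Alice Smith", "phone_number": "123-456-7890"}
print(missing_fields(record, REQUIRED_CUSTOMER_FIELDS))
```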

In [11]:
response = requests.post(url, headers=headers, json=data)

# Checking if the request was successful
if response.status_code == 200:
    print("Data retrieved successfully!")
    # Extracting and printing the generated data
    mock_data = response.json()['choices'][0]['message']['content']
    print("Generated Mock Data:", mock_data)
else:
    print("Failed to retrieve data, status code:", response.status_code)

Data retrieved successfully!
Generated Mock Data: {
    "customers": [
        {
            "customer_id": "C0001",
            "name": "Alice Smith",
            "email": "alice.smith@example.com",
            "phone_number": "123-456-7890",
            "address": "123 Main Street, Anytown, USA",
            "order_id": "O0001",
            "delivery_date": "2022-07-05",
            "perishable_goods": [
                {
                    "item_id": "I0001",
                    "item_name": "Apples",
                    "quantity": 10,
                    "temperature_control_required": true,
                    "expiry_date": "2022-07-10"
                },
                {
                    "item_id": "I0002",
                    "item_name": "Milk",
                    "quantity": 2,
                    "temperature_control_required": true,
                    "expiry_date": "2022-07-07"
                }
            ]
        },
        {
            "customer_id": "C0002",

### 6. Parsing API Mock Data into DataFrames with Python

In this section of our tutorial, we'll learn how to parse the JSON mock data received from our API into two separate DataFrames using the pandas library in Python. We will create one DataFrame for customer information and another for perishable goods, linking them using the customer ID as a key.

#### Importing Necessary Libraries
Before we start parsing the data, we need to ensure that we have the necessary Python libraries installed and imported. We'll primarily use `pandas` for data manipulation and `json` for handling JSON data.


In [12]:
import pandas as pd
import json


#### Loading the Mock Data
Assuming the mock data arrives as a JSON-formatted string from the API, we first convert it into a Python dictionary with `json.loads()`. The cell below uses a hard-coded sample in the same shape as the first response so that the example is reproducible.


In [13]:
# Mock JSON response from the API
json_data = """
{
  "customers": [
    {
      "customer_id": 1,
      "name": "John Doe",
      "email": "johndoe@example.com",
      "phone": "123-456-7890",
      "address": "123 Main St, Anytown, USA",
      "order_id": "ORD1234",
      "delivery_date": "2022-05-15",
      "perishable_goods": [
        {
          "item_id": 1,
          "item_name": "Fresh Produce",
          "quantity": 50,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-20"
        },
        {
          "item_id": 2,
          "item_name": "Dairy Products",
          "quantity": 30,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-18"
        }
      ]
    },
    {
      "customer_id": 2,
      "name": "Jane Smith",
      "email": "janesmith@example.com",
      "phone": "234-567-8901",
      "address": "456 Oak St, Smallville, USA",
      "order_id": "ORD5678",
      "delivery_date": "2022-05-20",
      "perishable_goods": [
        {
          "item_id": 3,
          "item_name": "Frozen Seafood",
          "quantity": 20,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-23"
        },
        {
          "item_id": 4,
          "item_name": "Fresh Meat",
          "quantity": 15,
          "temperature_control": "Yes",
          "expiry_date": "2022-05-21"
        }
      ]
    }
  ]
}
"""

# Convert JSON string to a dictionary
data_dict = json.loads(json_data)
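
When the string comes straight from the model rather than from a hard-coded sample, `json.loads()` can fail, for instance if the reply is cut off or wrapped in a Markdown code fence. The following sketch handles both cases; `parse_model_json` is a hypothetical helper name, and the fence handling is an assumption about how such replies are typically formatted.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_model_json(text):
    """Parse JSON from a model reply, tolerating a surrounding Markdown fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line and anything from the closing fence on
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON (possibly truncated): {exc}") from exc


fenced_reply = FENCE + 'json\n{"customers": []}\n' + FENCE
print(parse_model_json(fenced_reply))  # {'customers': []}
```

A truncated reply such as `{"customers": [` raises a `ValueError` with the parser's message instead of an unexplained traceback.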



#### Creating DataFrames
We'll now create two separate DataFrames: one for customers and another for perishable goods. We'll use a nested list comprehension for the perishable goods to ensure they are properly linked to their respective customers via the customer ID.

#### Customer DataFrame


In [15]:
# Extract customer information and create a DataFrame
customers_df = pd.DataFrame(data_dict['customers'])
print(customers_df)


   customer_id        name                  email         phone  \
0            1    John Doe    johndoe@example.com  123-456-7890   
1            2  Jane Smith  janesmith@example.com  234-567-8901   

                       address order_id delivery_date  \
0    123 Main St, Anytown, USA  ORD1234    2022-05-15   
1  456 Oak St, Smallville, USA  ORD5678    2022-05-20   

                                    perishable_goods  
0  [{'item_id': 1, 'item_name': 'Fresh Produce', ...  
1  [{'item_id': 3, 'item_name': 'Frozen Seafood',...  


#### Perishable Goods DataFrame
Extract the nested 'perishable_goods' and flatten them into a separate DataFrame. Ensure each item includes the customer ID.



In [16]:
# Extract perishable goods and include customer ID
goods_list = [
    {**item, 'customer_id': customer['customer_id']}
    for customer in data_dict['customers']
    for item in customer['perishable_goods']
]

# Create a DataFrame for perishable goods
goods_df = pd.DataFrame(goods_list)
print(goods_df)


   item_id       item_name  quantity temperature_control expiry_date  \
0        1   Fresh Produce        50                 Yes  2022-05-20   
1        2  Dairy Products        30                 Yes  2022-05-18   
2        3  Frozen Seafood        20                 Yes  2022-05-23   
3        4      Fresh Meat        15                 Yes  2022-05-21   

   customer_id  
0            1  
1            1  
2            2  
3            2  
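
As an alternative to the list comprehension, pandas offers `pd.json_normalize`, which flattens nested records in a single call. This is a sketch of a different route to the same kind of table, not what the cells above use, and it works on a shortened copy of the sample data. It also drops the nested `perishable_goods` column from the customer table, since that information now lives in the goods table.

```python
import pandas as pd

# Shortened copy of the sample data used in this section
data_dict = {
    "customers": [
        {
            "customer_id": 1,
            "name": "John Doe",
            "perishable_goods": [
                {"item_id": 1, "item_name": "Fresh Produce", "quantity": 50},
                {"item_id": 2, "item_name": "Dairy Products", "quantity": 30},
            ],
        },
        {
            "customer_id": 2,
            "name": "Jane Smith",
            "perishable_goods": [
                {"item_id": 3, "item_name": "Frozen Seafood", "quantity": 20},
            ],
        },
    ]
}

# One row per item, with the parent's customer_id carried along as metadata
goods_df = pd.json_normalize(
    data_dict["customers"], record_path="perishable_goods", meta=["customer_id"]
)

# Customer table without the nested list column
customers_df = pd.DataFrame(data_dict["customers"]).drop(columns="perishable_goods")

print(goods_df)
print(customers_df)
```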


#### Linking DataFrames
The `customer_id` in both DataFrames allows us to link these tables. This is useful for operations that involve joining data based on customer relationships, such as merging tables or performing group-wise analysis.

In [17]:
customers_df.head()

|   | customer_id | name | email | phone | address | order_id | delivery_date | perishable_goods |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | John Doe | johndoe@example.com | 123-456-7890 | 123 Main St, Anytown, USA | ORD1234 | 2022-05-15 | [{'item_id': 1, 'item_name': 'Fresh Produce', ... |
| 1 | 2 | Jane Smith | janesmith@example.com | 234-567-8901 | 456 Oak St, Smallville, USA | ORD5678 | 2022-05-20 | [{'item_id': 3, 'item_name': 'Frozen Seafood',... |


In [18]:
goods_df.head()

|   | item_id | item_name | quantity | temperature_control | expiry_date | customer_id |
|---|---|---|---|---|---|---|
| 0 | 1 | Fresh Produce | 50 | Yes | 2022-05-20 | 1 |
| 1 | 2 | Dairy Products | 30 | Yes | 2022-05-18 | 1 |
| 2 | 3 | Frozen Seafood | 20 | Yes | 2022-05-23 | 2 |
| 3 | 4 | Fresh Meat | 15 | Yes | 2022-05-21 | 2 |
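
To make the link concrete, here is a minimal sketch of a merge on `customer_id` followed by a group-wise total. The two small DataFrames are rebuilt by hand from the values shown above so the block runs on its own; the names `merged` and `totals` are introduced only for this illustration.

```python
import pandas as pd

# Hand-built copies of the relevant columns from the two tables above
customers_df = pd.DataFrame(
    {"customer_id": [1, 2], "name": ["John Doe", "Jane Smith"]}
)
goods_df = pd.DataFrame(
    {
        "item_id": [1, 2, 3, 4],
        "item_name": ["Fresh Produce", "Dairy Products", "Frozen Seafood", "Fresh Meat"],
        "quantity": [50, 30, 20, 15],
        "customer_id": [1, 1, 2, 2],
    }
)

# Attach customer details to every item row via the shared key
merged = goods_df.merge(customers_df, on="customer_id", how="left")

# Total quantity shipped per customer
totals = merged.groupby("name")["quantity"].sum()

print(merged)
print(totals)
```

A left merge keeps every item row even if a matching customer were missing, which makes gaps in the generated data easy to spot.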
