Step 1: Install Required Libraries
If you haven’t already, install the necessary packages:

In [1]:
%pip install pandas geopy

Collecting geopy
  Downloading geopy-2.4.1-py3-none-any.whl.metadata (6.8 kB)
Collecting geographiclib<3,>=1.52 (from geopy)
  Downloading geographiclib-2.0-py3-none-any.whl.metadata (1.4 kB)
Downloading geopy-2.4.1-py3-none-any.whl (125 kB)
Downloading geographiclib-2.0-py3-none-any.whl (40 kB)
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-2.0 geopy-2.4.1
Note: you may need to restart the kernel to use updated packages.





Step 2: Read and Clean the CSV

In [None]:
import pandas as pd

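# Load the semicolon-delimited CSV; on_bad_lines='skip' silently drops rows that fail to parse
df = pd.read_csv("data/new_product.csv", delimiter=";", quotechar='"', engine='python', on_bad_lines='skip')
# To strip trailing commas, use the .str accessor per column, e.g. df[col].str.rstrip(','); a DataFrame itself has no .str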
# Optional: preview the first few rows
print(df.head())

  PRODUCT_ID                                       PRODUCT_NAME  \
0     P_0110  Logitech Signature M650 L Full Size Wireless M...   
1     P_0138     Logitech G305 LIGHTSPEED Wireless Gaming Mouse   
2     P_0053                       Logitech M185 Wireless Mouse   
3     P_0125                              HP 150 Wireless Mouse   
4     P_0118                                 Logitech ERGO K860   

         CATEGORY                                        DESCRIPTION  \
0  IT Accessories  Scroll smarter: With Logitech Signature M650 W...   
1  IT Accessories  HERO Gaming Sensor: Next-gen HERO mouse sensor...   
2  IT Accessories  Compact Mouse: With a comfortable and contoure...   
3  IT Accessories  600 DPI Optical Mouse Sensor, 2.4GHz Wireless ...   
4  IT Accessories     Logitech ERGO K860 Wireless Ergonomic Keyboard   

  UNIT_PRICE SUPPLIER_ID   SUPPLIER_NAME  LEAD_TIME_DAYS  MIN_ORDER CURRENCY  \
0      34,75        S148    SpeedStorage              13         32     EURO   
1   
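
Two details in this step are easy to miss: on_bad_lines='skip' drops malformed rows without reporting them, and UNIT_PRICE arrives as text because it uses a decimal comma ("34,75"). The sketch below shows one possible way to handle both, assuming pandas 1.4 or newer (where on_bad_lines accepts a callable with engine='python'). The sample rows, the `skipped` list, and the `remember_bad_line` helper are hypothetical and only modeled on the preview above.

```python
import io
import pandas as pd

# Hypothetical sample modeled on the preview above; the second data row has an extra field
sample = io.StringIO(
    "PRODUCT_ID;UNIT_PRICE;CURRENCY\n"
    "P_0110;34,75;EURO\n"
    "P_0138;42,50;EURO;unexpected\n"
    "P_0053;12,00;EURO\n"
)

skipped = []  # malformed rows are collected here instead of vanishing


def remember_bad_line(fields):
    """Record the offending row; returning None tells pandas to drop it."""
    skipped.append(fields)
    return None


clean = pd.read_csv(
    sample,
    delimiter=";",
    quotechar='"',
    engine="python",
    decimal=",",  # parse "34,75" as the float 34.75
    on_bad_lines=remember_bad_line,
)

print(f"kept {len(clean)} rows, skipped {len(skipped)}")
print(clean["UNIT_PRICE"].tolist())
```

Inspecting `skipped` after loading makes it clear how much data was lost, which helps decide whether the source file needs fixing rather than skipping.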

Step 3: Get Unique Cities

In [2]:
unique_cities = df['SUPPLIER_CITY'].dropna().unique()
print(unique_cities[:10])  # print first 10 for sanity check

['Frankfurt' 'Cologne' 'Dortmund' 'Essen' 'Hamburg' 'Munich' 'Berlin'
 'Stuttgart' 'Fürth' 'Langenhagen']
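
If the city column contains stray whitespace, the same city can appear more than once in the unique list and be geocoded twice. A minimal sketch of normalizing before deduplicating, using a hypothetical Series rather than the real SUPPLIER_CITY column:

```python
import pandas as pd

# Hypothetical column with the kind of inconsistencies that inflate the unique list
cities = pd.Series(["Frankfurt", " Cologne", "Cologne ", None, "Fürth", "Frankfurt"])

# Drop missing values, trim whitespace, then keep the first occurrence of each name
unique_clean = cities.dropna().str.strip().drop_duplicates().tolist()
print(unique_clean)
```

If this normalization is used, the same .str.strip() should also be applied to df['SUPPLIER_CITY'] before the merge in Step 4, otherwise the keys will not match.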


Step 4: Get Latitude and Longitude with Geopy

In [None]:
from geopy.geocoders import Nominatim
import time

geolocator = Nominatim(user_agent="geo_enricher")

# Function to get lat/lon for a city; returns (None, None) if the lookup fails
def get_lat_lon(city):
    try:
        location = geolocator.geocode(city, timeout=4)
        if location:
            return pd.Series([location.latitude, location.longitude])
    except Exception as e:
        print(f"Error fetching for {city}: {e}")
    finally:
        time.sleep(1)  # Nominatim's usage policy allows at most one request per second
    return pd.Series([None, None])

# Create a DataFrame mapping city to coordinates
city_coords = pd.DataFrame(unique_cities, columns=['SUPPLIER_CITY'])
city_coords[['CITY_LAT', 'CITY_LONG']] = city_coords['SUPPLIER_CITY'].apply(get_lat_lon)

# Merge back into original dataframe
df = df.merge(city_coords, on='SUPPLIER_CITY', how='left')
# Reorder columns so RATING stays last, after the new coordinate columns
df = df[[col for col in df.columns if col != 'RATING'] + ['RATING']]
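
Geocoding can fail for individual cities (timeouts, ambiguous names, no match), and those rows end up with empty coordinates after the merge. The sketch below is one way to build the lookup table and list the failures. It is written against a generic `geocode` callable so it runs offline: `build_city_coords`, `FakeLocation`, and `fake_table` are hypothetical names, and the coordinates are approximate values for illustration only.

```python
import time
from collections import namedtuple

import pandas as pd


def build_city_coords(cities, geocode, delay_seconds=1.0):
    """Return a city -> coordinates table, calling `geocode` once per city.

    `geocode` is any callable that takes a query string and returns an object
    with .latitude and .longitude, or None when nothing is found.
    """
    rows = []
    for city in cities:
        try:
            location = geocode(city)
        except Exception as exc:  # e.g. timeouts or service errors
            print(f"Error fetching for {city}: {exc}")
            location = None
        rows.append({
            "SUPPLIER_CITY": city,
            "CITY_LAT": location.latitude if location is not None else None,
            "CITY_LONG": location.longitude if location is not None else None,
        })
        time.sleep(delay_seconds)  # pause between requests
    return pd.DataFrame(rows)


# Offline stand-in for a real geocoder (assumption: approximate coordinates)
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])
fake_table = {
    "Berlin": FakeLocation(52.52, 13.405),
    "Munich": FakeLocation(48.137, 11.575),
}

coords = build_city_coords(["Berlin", "Munich", "Atlantis"], fake_table.get, delay_seconds=0)

# Cities that could not be resolved and may need manual correction
missing = coords.loc[coords["CITY_LAT"].isna(), "SUPPLIER_CITY"].tolist()
print(missing)
```

With geopy, the callable could be geolocator.geocode itself, or a wrapper such as geopy.extra.rate_limiter.RateLimiter(geolocator.geocode, min_delay_seconds=1), in which case delay_seconds can be set to 0.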


Step 5: Save the Enriched Data

In [5]:
df.to_csv("new_product_with_geo.csv", sep=';', quotechar='"', index=False)
print("Done! File saved as 'new_product_with_geo.csv'")


Done! File saved as 'new_product_with_geo.csv'
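
A quick read-back can confirm that the file was written with the expected delimiter and columns. This is only a sketch: the `enriched` frame is a hypothetical stand-in for df, with approximate coordinates, and it writes to a temporary directory rather than the real output path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical enriched frame standing in for df
enriched = pd.DataFrame({
    "PRODUCT_ID": ["P_0110", "P_0138"],
    "SUPPLIER_CITY": ["Frankfurt", "Cologne"],
    "CITY_LAT": [50.11, 50.94],
    "CITY_LONG": [8.68, 6.96],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "new_product_with_geo.csv")
    # Same writer settings as in the step above
    enriched.to_csv(path, sep=";", quotechar='"', index=False)
    # Read back with the matching delimiter
    reloaded = pd.read_csv(path, delimiter=";", quotechar='"')

same_shape = reloaded.shape == enriched.shape
print(same_shape, list(reloaded.columns))
```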
