# Stolen Vehicle Data Analysis

## Step 1: Import Necessary Libraries

In this step, we import the required Python libraries, NumPy and pandas, for data manipulation and analysis.

```python
import numpy as np
import pandas as pd
```

## Step 2: Load the Dataset

We specify the full path to the CSV file containing our dataset and store it in the variable `csv_file_path`.
The data is then loaded from the CSV file into a pandas DataFrame using `pd.read_csv()`, with ',' (comma) as the delimiter. The comma is also the default delimiter, so `sep=','` is stated here only for clarity.

```python
# Specify the full path to the CSV file
csv_file_path = "/Users/muhammadfauzy/Documents/1. FOLDER KERJAAN/1. Data Analyst /2. DataSet/DataSet Stolen Vehicle/StolenVehicles.csv"

# Load the data from the CSV file with ',' as the delimiter
data = pd.read_csv(csv_file_path, sep=',')
```


## Step 3: Print Headers

In this step, we extract and print the column names (headers) of the DataFrame using data.columns.

```python
# Get the headers (column names) of the DataFrame
headers = data.columns

# Print the headers
print(headers.tolist())
```


## Step 4: Check for Missing Values

We perform missing value analysis on the dataset.
data.isnull().sum() calculates the count of missing values in each column of the DataFrame.
data['DateStolen'].isnull().sum() calculates the count of missing values specifically in the 'DateStolen' column.

```python
# Check for missing values in the entire DataFrame
missing_values = data.isnull().sum()

# Check for missing values in specific columns, e.g., 'DateStolen'
missing_date_values = data['DateStolen'].isnull().sum()

# Print the results
print("Missing Values in the Entire DataFrame:")
print(missing_values)

print("\nMissing Values in 'DateStolen' Column:")
print(missing_date_values)
```


## Step 5: Drop Rows with Missing Values

We drop the rows that contain missing values, because the number of missing values is not significant relative to the size of the dataset.
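
The notebook does not show the code for this step, so the following is only a minimal sketch of one way to do it. It assumes a small stand-in DataFrame in place of the real dataset (the `VehicleType` column and all values are hypothetical), and the result name `data_cleaned` is chosen to match the variable used in the geocoding example later on. It first measures how many rows would be lost, then drops them with `dropna()`.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame loaded in Step 2.
# 'VehicleType' and all values are made up for illustration only.
data = pd.DataFrame({
    "VehicleType": ["Sedan", None, "Trailer", "SUV"],
    "DateStolen": ["2021-11-05", "2021-12-13", None, "2022-02-13"],
})

# Count the rows that contain at least one missing value
rows_with_missing = data.isnull().any(axis=1).sum()
missing_share = rows_with_missing / len(data)
print(f"Rows with missing values: {rows_with_missing} ({missing_share:.1%})")

# Drop every row that contains at least one missing value
data_cleaned = data.dropna().reset_index(drop=True)
print(data_cleaned.shape)
```

Checking the share of affected rows before calling `dropna()` gives a concrete basis for deciding whether dropping is acceptable or whether imputing values would be a better fit.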


## Using an API to Get Latitude and Longitude Data

If you have the location names (city names) in your dataset and you want to obtain their corresponding geospatial coordinates (latitude and longitude) using an API, you can use a geocoding API service like the Google Maps Geocoding API or a similar service. Here's a general outline of how you can achieve this:

1. **Obtain an API Key:**
   - Sign up for an account with a geocoding service provider (e.g., Google Maps).
   - Create a project and obtain an API key, which you'll use to make API requests.

2. **Make API Requests:**
   - Use Python to make API requests to the geocoding service provider's API.
   - Pass the city names from your dataset as input to the API.
   - Receive the geospatial coordinates (latitude and longitude) as the API response.

3. **Update Your Dataset:**
   - Add new columns for latitude and longitude to your dataset.
   - Populate these columns with the geospatial coordinates obtained from the API.

Here's an example of how you can use the Google Maps Geocoding API to obtain geospatial coordinates for city names and update your dataset:

```python
import requests
import pandas as pd

# Replace 'YOUR_API_KEY' with your actual Google Maps Geocoding API key
api_key = 'YOUR_API_KEY'

# Function to get geospatial coordinates for a given city name
def get_coordinates(city_name):
    url = 'https://maps.googleapis.com/maps/api/geocode/json'
    # Pass the query via params so requests URL-encodes names with spaces or symbols
    params = {'address': city_name, 'key': api_key}
    response = requests.get(url, params=params, timeout=10)
    result = response.json()

    if result['status'] == 'OK':
        # Extract latitude and longitude from the first match in the API response
        location = result['results'][0]['geometry']['location']
        return location['lat'], location['lng']
    else:
        # No match or an API error (e.g. ZERO_RESULTS, REQUEST_DENIED)
        return None, None

# Load your cleaned dataset into a DataFrame (assuming it contains a 'Location' column with city names)
data_cleaned = pd.read_csv('your_cleaned_dataset.csv')

# Create new columns for latitude and longitude
data_cleaned['Latitude'] = None
data_cleaned['Longitude'] = None

# Iterate over the 'Location' column and get geospatial coordinates
for index, row in data_cleaned.iterrows():
    city_name = row['Location']
    lat, lng = get_coordinates(city_name)
    data_cleaned.at[index, 'Latitude'] = lat
    data_cleaned.at[index, 'Longitude'] = lng

# Save the updated dataset
data_cleaned.to_csv('updated_dataset_with_coordinates.csv', index=False)
```

In this code:

- You define a function `get_coordinates(city_name)` to make API requests to Google Maps Geocoding API and extract latitude and longitude.
- You load your cleaned dataset and create new columns for latitude and longitude.
- You iterate over the 'Location' column, call the API to get coordinates for each city name, and update the corresponding rows in the dataset.
- Finally, you save the updated dataset with the geospatial coordinates to a new CSV file.

Please ensure you have the necessary permissions and billing set up with the geocoding API service you choose to use.
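
Because geocoding requests are billed and rate-limited, it may help to look up each distinct city name only once instead of once per row. The sketch below is one possible way to do that, not part of the original workflow. The helper `add_coordinates`, the stand-in `fake_geocode` function, and the placeholder city names and coordinates are all hypothetical; in practice you would pass a real geocoding function that returns a `(lat, lng)` tuple.

```python
import pandas as pd

def add_coordinates(df, geocode, column="Location"):
    """Geocode each distinct value in `column` once and map the results to all rows."""
    unique_names = df[column].dropna().unique()
    # One lookup per distinct name, regardless of how many rows share it
    lookup = {name: geocode(name) for name in unique_names}

    result = df.copy()
    result["Latitude"] = result[column].map(lambda name: lookup.get(name, (None, None))[0])
    result["Longitude"] = result[column].map(lambda name: lookup.get(name, (None, None))[1])
    return result

# Hypothetical stand-in for a real geocoder: records each call and returns
# placeholder coordinates, so the sketch runs without an API key.
calls = []

def fake_geocode(city_name):
    calls.append(city_name)
    placeholder_table = {"City A": (1.0, 2.0), "City B": (3.0, 4.0)}
    return placeholder_table.get(city_name, (None, None))

sample = pd.DataFrame({"Location": ["City A", "City B", "City A", "City A"]})
with_coords = add_coordinates(sample, fake_geocode)

print(with_coords)
print("Geocoder calls made:", len(calls))
```

With four rows but only two distinct names, the stand-in geocoder is called twice. On a real dataset with many repeated locations, the same idea can reduce the number of billed requests considerably.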