# Week 1: Introduction to Python and Pandas for GIS
This notebook contains three labs for Week 1 of the GIS Programming with Python course. Each lab builds on the previous one: Lab 1 covers data analysis with Python's Pandas library, Lab 2 covers visualization with Matplotlib, and Lab 3 introduces basic geospatial data handling with GeoPandas.

## Dataset:
We will use a dataset of 20 Nigerian cities with their latitude, longitude, population, and state information. Ensure you upload this dataset (nigerian_cities.csv) before starting the labs.

---


## Lab 1: Exploring Nigerian City Data with Pandas

### Objective:
In this lab, we will load and inspect a CSV dataset of Nigerian cities, perform basic filtering and sorting, and summarize the data.

### Instructions:
1. Load the dataset from a CSV file.
2. Display the first few rows to inspect the structure of the dataset.
3. Perform a basic filter to show cities with populations over 1 million.
4. Sort the cities by population in descending order.
5. Calculate summary statistics for the dataset.

Let's start by importing the necessary libraries and loading the dataset.


In [None]:
# Install all necessary packages
!pip install pandas shapely numpy matplotlib geopandas

In [None]:
# Import necessary libraries
import pandas as pd

# Load the CSV dataset (make sure to upload your CSV file into the notebook)
csv_path = "nigerian_cities.csv"  # Local file path; adjust it to match your uploaded file
cities_df = pd.read_csv(csv_path)

# 1. Display the first 5 rows of the dataset to inspect it
print("First 5 rows of the dataset:")
cities_df.head()


### Data Inspection:
Now, let's check for any missing values and then filter the cities with populations greater than 1 million.
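
As a minimal sketch of how these two steps behave, the example below runs them on a tiny in-memory table. The rows are made-up placeholders, not values from `nigerian_cities.csv`, and the `City` column name is an assumption; only `Population` is taken from the lab code.

```python
import pandas as pd

# Made-up placeholder rows (not values from nigerian_cities.csv)
sample_df = pd.DataFrame({
    "City": ["City A", "City B", "City C"],  # hypothetical column name
    "Population": [2_500_000, 800_000, None],
})

# isnull() marks missing cells; sum() counts them per column
missing_counts = sample_df.isnull().sum()
print(missing_counts)

# The comparison returns a boolean Series; indexing with it keeps the True rows.
# A missing population compares as False, so that row is dropped by the filter.
mask = sample_df["Population"] > 1_000_000
large_sample = sample_df[mask]
print(large_sample)
```

Because missing values silently fail the comparison, it is worth checking the missing-value counts before trusting the filtered result.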


In [None]:
# 2. Check for missing values in the dataset
print("Missing values in the dataset:")
print(cities_df.isnull().sum())

# 3. Filter cities with population greater than 1 million
large_cities = cities_df[cities_df['Population'] > 1000000]

# Display the filtered results
print("\nCities with populations greater than 1 million:")
large_cities


### Sorting and Summarizing:
Finally, we will sort the filtered cities by population and calculate some basic statistics.
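
The sketch below shows what sorting and summarizing return, using made-up placeholder rows rather than the real dataset. The `City` column name is an assumption for illustration.

```python
import pandas as pd

# Made-up placeholder rows (not values from nigerian_cities.csv)
sample_df = pd.DataFrame({
    "City": ["City A", "City B", "City C"],  # hypothetical column name
    "Population": [1_200_000, 3_400_000, 2_100_000],
})

# sort_values returns a new DataFrame; sample_df itself keeps its original order
sorted_sample = sample_df.sort_values(by="Population", ascending=False)
print(sorted_sample)

# describe() on a numeric column reports count, mean, std, min, quartiles, and max
stats = sorted_sample["Population"].describe()
print(stats)
```

Note that the sorted frame keeps the original row labels, so the index appears out of order after sorting.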


In [None]:
# 4. Sort the filtered cities by population in descending order
sorted_large_cities = large_cities.sort_values(by='Population', ascending=False)

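# Display sorted results (a notebook only auto-displays the last expression in a cell,
# so an explicit print is needed here)
print("Cities sorted by population (descending):")
print(sorted_large_cities)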

# 5. Calculate and display basic statistics
print("\nBasic statistics for cities with populations over 1 million:")
sorted_large_cities.describe()


## Basic Python Commands

Before we dive into GIS-specific tasks, let's review some basic Python commands that will help us manipulate data and files in GIS analysis.


In [None]:
# Variables in Python
x = 10
y = 20
print("The sum of x and y is:", x + y)

In [None]:
# Basic list manipulation
data_list = [1, 2, 3, 4, 5]
print("List before appending:", data_list)
data_list.append(6)
print("List after appending:", data_list)

In [None]:
# Dictionaries in Python
data_dict = {"name": "GIS", "tool": "Python", "purpose": "Spatial Analysis"}
print("Dictionary:", data_dict)

---
## Reflection:
- How many Nigerian cities have populations greater than 1 million?
- What is the average population of these cities?
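
One possible way to answer both questions is sketched below on made-up placeholder rows. Running the same two lines against `cities_df` would give the answers for the real dataset.

```python
import pandas as pd

# Made-up placeholder rows (not values from nigerian_cities.csv)
sample_df = pd.DataFrame({
    "City": ["City A", "City B", "City C"],  # hypothetical column name
    "Population": [1_500_000, 500_000, 2_500_000],
})

large_sample = sample_df[sample_df["Population"] > 1_000_000]

# len() counts the rows that passed the filter
n_large = len(large_sample)

# mean() averages the Population column of the filtered rows only
mean_pop = large_sample["Population"].mean()

print("Cities over 1 million:", n_large)
print("Average population of those cities:", mean_pop)
```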


## Lab 2: Plotting Nigerian Cities on a Map

### Objective:
In this lab, we will create a scatter plot to visualize the Nigerian cities based on their latitude and longitude. We will also adjust the marker sizes based on the cities' populations.

### Instructions:
1. Extract the latitude, longitude, and population data.
2. Use `matplotlib` to create a scatter plot.
3. Adjust the marker sizes according to the population and customize the plot.
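
The sketch below illustrates step 3 on made-up coordinates and populations. In `matplotlib`, the `s` argument of `scatter` is the marker area in points squared, so dividing the population by a constant makes marker area proportional to population. The divisor `10_000` is an arbitrary choice for this sketch, not a recommended value.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up placeholder rows (not values from nigerian_cities.csv)
sample_df = pd.DataFrame({
    "Longitude": [3.0, 7.5, 8.5],
    "Latitude": [6.5, 9.0, 12.0],
    "Population": [4_000_000, 1_000_000, 250_000],
})

# Marker area in points squared; the divisor is an arbitrary scaling assumption
sizes = sample_df["Population"] / 10_000

fig, ax = plt.subplots(figsize=(6, 5))
# Longitude goes on the x-axis and latitude on the y-axis so north is up
points = ax.scatter(sample_df["Longitude"], sample_df["Latitude"],
                    s=sizes, c="blue", alpha=0.5)
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("Marker area scaled by population (made-up data)")
plt.show()
```

If the smallest cities become hard to see, a smaller divisor or a fixed minimum size are two options to try.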

Let's import the necessary libraries and create the plot.


In [None]:
# Import matplotlib for plotting
import matplotlib.pyplot as plt

# Extract latitude, longitude, and population data for plotting
latitudes = cities_df['Latitude']
longitudes = cities_df['Longitude']
populations = cities_df['Population'] / 100000  # Scale to marker area in points^2 (1,000,000 people -> s=10)

# 1. Create a scatter plot with latitude and longitude
plt.figure(figsize=(10, 8))
plt.scatter(longitudes, latitudes, s=populations, c='blue', alpha=0.5)

# 2. Customize the plot with title and labels
plt.title('Nigerian Cities by Population')
plt.xlabel('Longitude')
plt.ylabel('Latitude')

# 3. Show the plot
plt.show()


---
## Reflection:
- Which areas of Nigeria have the most densely clustered cities?
- Are there any noticeable spatial patterns in the distribution of large cities?


## Lab 3: Loading and Visualizing Nigerian State Boundaries

### Objective:
This lab introduces GeoPandas for handling spatial data. You will load Nigerian state boundaries from a shapefile and overlay the cities on top of the state map.

### Instructions:
1. Load the Nigerian state boundaries shapefile.
2. Visualize the state boundaries using GeoPandas.
3. Convert the cities dataset to a GeoDataFrame and overlay the cities on the state boundary map.

Let's begin by importing GeoPandas and loading the shapefile.


In [None]:
# Import Geopandas for spatial data handling
import geopandas as gpd
from shapely.geometry import Point

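# Load the Nigerian state boundaries shapefile
# Replace with the path to your Nigerian state boundaries shapefile (.shp).
# A shapefile is a set of files: upload the matching .shx and .dbf (and .prj, if present) as well.
nigeria_states = gpd.read_file('nigerian_states.shp')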

# 1. Plot Nigerian state boundaries
nigeria_states.plot(figsize=(10, 8), color='lightgrey', edgecolor='black')
plt.title('Nigerian State Boundaries')
plt.show()


### Converting Cities to GeoDataFrame:
Now, we will convert the cities DataFrame into a GeoDataFrame so that we can plot the cities on the map.
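
The conversion can also be sketched with `gpd.points_from_xy`, which builds the point geometry without a row-by-row `apply`. The example below uses made-up coordinates and a plain rectangle as a hypothetical stand-in for a boundary layer. It assumes the coordinates are WGS84 degrees (EPSG:4326), which the dataset description does not state explicitly.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Made-up placeholder rows (not values from nigerian_cities.csv)
sample_df = pd.DataFrame({
    "City": ["City A", "City B"],  # hypothetical column name
    "Longitude": [3.0, 8.5],
    "Latitude": [6.5, 12.0],
})

# x is longitude and y is latitude; the CRS is an assumption (WGS84 degrees)
sample_gdf = gpd.GeoDataFrame(
    sample_df,
    geometry=gpd.points_from_xy(sample_df["Longitude"], sample_df["Latitude"]),
    crs="EPSG:4326",
)

# Hypothetical stand-in for a boundary layer: a single rectangle
region = gpd.GeoDataFrame(
    {"name": ["Region"]},
    geometry=[box(2.0, 4.0, 15.0, 14.0)],
    crs="EPSG:4326",
)

# Reproject the points to the boundary layer's CRS before overlaying them
# (a no-op here, because both layers already use EPSG:4326)
sample_gdf = sample_gdf.to_crs(region.crs)
print(sample_gdf)
```

Declaring a CRS on both layers lets GeoPandas reproject one to the other, which matters when a shapefile is stored in a projected coordinate system rather than in degrees.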


In [None]:
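# 2. Convert cities_df to a GeoDataFrame of points (x = longitude, y = latitude).
# Assumption: the coordinates are WGS84 degrees (EPSG:4326).
city_points = gpd.points_from_xy(cities_df['Longitude'], cities_df['Latitude'])
cities_gdf = gpd.GeoDataFrame(cities_df, geometry=city_points, crs='EPSG:4326')

# Reproject the cities to the states' CRS so the two layers line up
if nigeria_states.crs is not None:
    cities_gdf = cities_gdf.to_crs(nigeria_states.crs)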

# 3. Plot Nigerian states and overlay cities
base = nigeria_states.plot(figsize=(10, 8), color='lightgrey', edgecolor='black')
cities_gdf.plot(ax=base, marker='o', color='red', markersize=5)
plt.title('Nigerian Cities and State Boundaries')
plt.show()
