# 00_scope.ipynb
# Climate Wine Project – Spatial Analysis of Wine Production in Alentejo

## 1. Introduction
This notebook defines the scope of a spatial analysis of wine production in Alentejo, Portugal.
The goal is to investigate how climatic variables (annual mean temperature and annual total precipitation) influence wine production, to identify spatial patterns,
and to detect potential areas of risk or opportunity.

## 2. Main Question
**How do climatic variables (annual mean temperature and annual total precipitation) influence wine production by municipality in Alentejo?**

## 3. Sub-Questions
1. **Which municipalities have the highest productivity, and how does this relate to climate?**  
   - Deliverable: map of wine production per municipality with climate overlays.
2. **Which areas combine high production with high climatic stress?**  
   - Deliverable: hotspot maps showing production in regions with climatic stress.
3. **Are there clear spatial patterns in production distribution (high or low production clusters)?**  
   - Deliverable: spatial cluster analysis using QGIS and Python (geopandas, rasterio, matplotlib).
4. **How does climate variability over the years affect average production per municipality?**  
   - Deliverable: time series graphs, trend analysis, and temporal correlation analysis.
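
The temporal correlation planned for sub-question 4 could be sketched as follows. This is a minimal illustration under stated assumptions: the column names (`municipality`, `year`, `mean_temp_c`, `production_hl`) and the toy values are hypothetical and not taken from the project data, and Pearson correlation is only one possible measure.

```python
import pandas as pd


def climate_production_correlation(df, climate_col="mean_temp_c",
                                   production_col="production_hl"):
    """Pearson correlation between a climate variable and production,
    computed separately for each municipality across its yearly rows."""
    return (
        df.groupby("municipality")[[climate_col, production_col]]
        .apply(lambda g: g[climate_col].corr(g[production_col]))
        .rename("pearson_r")
    )


# Toy yearly records (invented values, placeholder municipality labels).
toy = pd.DataFrame({
    "municipality": ["A", "A", "A", "B", "B", "B"],
    "year": [2001, 2002, 2003, 2001, 2002, 2003],
    "mean_temp_c": [16.0, 17.0, 18.0, 16.0, 17.0, 18.0],
    "production_hl": [100.0, 90.0, 80.0, 50.0, 60.0, 70.0],
})

result = climate_production_correlation(toy)
print(result)
```

With only a few years per municipality, such coefficients are unstable, so they are better read as a screening step than as evidence of a causal effect.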

## 4. Objectives
- Demonstrate integration of **spatial analysis and statistics**.  
- Apply **Python (pandas, geopandas, rasterio, matplotlib, seaborn)** and **QGIS** for georeferenced data analysis.  
- Produce **clear visualizations** to interpret patterns and risks in viticultural production.
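
As one illustration of combining spatial analysis with statistics (relevant to sub-question 3), the sketch below computes global Moran's I, a common measure of spatial autocorrelation. The notebook does not name a specific clustering statistic, so this choice is an assumption, and the four-unit chain of neighbours and its values are toy inputs rather than project data.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for `values` given a spatial weights matrix.

    weights[i, j] > 0 means units i and j are neighbours; the diagonal
    is expected to be zero.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()          # deviations from the mean
    s0 = w.sum()              # sum of all weights
    return (len(x) / s0) * (z @ w @ z) / (z @ z)


# Toy adjacency: four units in a row, each neighbouring the next one.
chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

clustered = morans_i([1, 2, 3, 4], chain)    # smooth gradient
alternating = morans_i([1, 4, 1, 4], chain)  # checkerboard-like pattern
print(clustered, alternating)
```

The gradient yields a positive value (similar neighbours) and the alternating pattern a negative one (dissimilar neighbours). For real municipality polygons, the weights matrix would have to be derived from the boundary geometries.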

## 5. Deliverables
- File: `notebooks/00_scope.ipynb` → questions, scope, and analysis plan.  
- Maps, graphs, and statistical analyses addressing each sub-question.  
- Organized dataset for future analyses (CSV, shapefiles, raster).

## 6. Tools
- **QGIS**: visualization and spatial analysis.  
- **Python**: pandas, geopandas, rasterio, matplotlib, seaborn.  
- **Git/GitHub**: version control and online portfolio.


In [1]:
# Sub-question 1: Which municipalities have the highest productivity, and how does this relate to climate?

# TODO: Import necessary libraries
# import pandas as pd
# import geopandas as gpd
# import matplotlib.pyplot as plt

# TODO: Load wine production data
# wine_data = pd.read_csv("data/wine_production.csv")

# TODO: Load climate data (temperature, precipitation)
# climate_data = pd.read_csv("data/climate.csv")

# TODO: Merge datasets on municipality
# merged_data = wine_data.merge(climate_data, on="municipality")

# TODO: Create a map showing wine production per municipality
# TODO: Overlay climate variables on the map


In [2]:
# Sub-question 2: Where are there areas of high production under high climatic stress?

# TODO: Identify hotspots in production data
# TODO: Combine with climate stress variables
# TODO: Visualize hotspots on a map using geopandas/matplotlib


In [3]:
# Sub-question 3: Are there clear spatial patterns in production distribution?

# TODO: Perform spatial clustering analysis
# TODO: Identify clusters of high/low production
# TODO: Visualize clusters on a map


In [4]:
# Sub-question 4: How does climate variability over the years affect average production per municipality?

# TODO: Load historical climate and production data
# TODO: Calculate yearly averages per municipality
# TODO: Perform correlation/trend analysis
# TODO: Plot time series graphs showing trends


## 2. Data Collection

This phase involved collecting the datasets required for the spatial analysis of wine production in Alentejo.
The datasets include production data, administrative boundaries, and climate variables.

### Datasets

- **Wine production per municipality**  
  - Source: IVV / INE (CSV format)  
  - Contains yearly wine production data for each municipality.

- **Administrative boundaries (municipalities)**  
  - Source: DGT / GADM (Shapefile / GeoJSON)  
  - Geospatial boundaries of municipalities in Alentejo.

- **Climate data**  
  - Source: WorldClim / Copernicus  
  - Raster format (GeoTIFF) with annual mean temperature and total precipitation.

- **Optional – Vegetation data**  
  - Source: Sentinel-2 NDVI / Copernicus  
  - Provides vegetation information to complement the analysis.
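
Because the climate data are rasters while production is reported per municipality, the raster values need to be summarized per municipality before the two can be compared. The sketch below shows the idea on toy arrays only: `zones` is a hypothetical grid of municipality ids with the same shape as the raster, and the `-9999` nodata value is an assumption. In the real workflow the zone grid would have to be derived from the boundary polygons.

```python
import numpy as np


def zonal_mean(values, zones, nodata=None):
    """Mean raster value per zone id, ignoring NaN and nodata cells."""
    values = np.asarray(values, dtype=float)
    zones = np.asarray(zones)
    valid = ~np.isnan(values)
    if nodata is not None:
        valid &= values != nodata
    return {
        int(z): float(values[valid & (zones == z)].mean())
        for z in np.unique(zones[valid])
    }


# Toy 2x2 temperature grid (degrees C) with one nodata cell.
temperature = [[15.0, 16.0],
               [17.0, -9999.0]]
# Toy zone grid: top row belongs to zone 1, bottom row to zone 2.
zones = [[1, 1],
         [2, 2]]

means = zonal_mean(temperature, zones, nodata=-9999.0)
print(means)
```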

### Data Management

- `data_raw/` contains the original datasets downloaded from official sources.
- `data_clean/` contains cleaned and processed datasets ready for analysis.

### Notes
 
- File names are organized for clarity (e.g., `wine_production.csv`, `temperature.tif`).  
- Raster and vector data are aligned in the same coordinate reference system (CRS) for spatial analysis.
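
The CRS alignment noted above could be checked and enforced along these lines. This is a sketch under assumptions: the target CRS `EPSG:3763` (ETRS89 / Portugal TM06) is an illustrative choice that the notebook does not specify, and the single in-memory point stands in for the real boundary layer.

```python
import geopandas as gpd
from shapely.geometry import Point

# Assumed target CRS for illustration: ETRS89 / Portugal TM06.
TARGET_CRS = "EPSG:3763"


def align_crs(gdf, target_crs=TARGET_CRS):
    """Return `gdf` reprojected to `target_crs`.

    Raises if the layer has no CRS, because reprojection is undefined then.
    """
    if gdf.crs is None:
        raise ValueError("GeoDataFrame has no CRS; assign one before reprojecting")
    return gdf.to_crs(target_crs)


# Stand-in layer: one point in geographic coordinates (lon, lat).
sample = gpd.GeoDataFrame(
    {"name": ["sample"]},
    geometry=[Point(-7.9, 38.57)],
    crs="EPSG:4326",
)

aligned = align_crs(sample)
print(aligned.crs)
```

A raster opened with rasterio exposes its CRS through the dataset's `.crs` attribute, which can be compared with the vector layer's CRS before any overlay.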


In [4]:
# Load raw wine production data
import pandas as pd

# The raw file is an Excel workbook, so it is read with read_excel rather than
# read_csv (this needs an Excel engine such as openpyxl). The path matches the
# other loading cells, assuming the notebook runs from the project root.
wine_raw = pd.read_excel("data_raw/wine_production_raw.xlsx")
wine_raw.head()  # Display first few rows

In [None]:
# Load raw administrative boundaries
import geopandas as gpd

boundaries_raw = gpd.read_file("data_raw/municipalities_raw.shp")
boundaries_raw.head()  # Display first few rows


In [None]:
# Load raw temperature raster
import rasterio

temp_raw = rasterio.open("data_raw/temperature_raw.tif")
print(temp_raw.profile)  # Show basic metadata (driver, dtype, size, CRS, transform)



In [None]:
# Load raw precipitation raster
precip_raw = rasterio.open("data_raw/precipitation_raw.tif")
print(precip_raw.profile)  # Show basic metadata (driver, dtype, size, CRS, transform)



In [None]:
# Load raw NDVI raster (optional; run only if the vegetation data was downloaded)
ndvi_raw = rasterio.open("data_raw/ndvi_raw.tif")
print(ndvi_raw.profile)  # Show basic metadata (driver, dtype, size, CRS, transform)
