# **Assignment 2.01: PM2.5 Air Quality Analysis Using Google Earth Engine**

# **Task 1: Select Study Region**

The selected study area is Bakersfield, California, USA (latitude 35.3733° N, longitude 119.0189° W). The city's location is distinctive because the surrounding mountains can trap pollution within the region. Bakersfield also has a broad range of pollution sources, including agricultural chemicals, dust, vehicle emissions, polluted air carried in from Southern California, and exhaust from Bakersfield's oil-drilling operations. In addition, wildfire risk is increasing with climate change. The objective of this analysis is to identify connections between pollution spikes and their potential causes, whether seasonal, industrial, wildfire-related, or a combination of these.

# **Task 2: Conduct PM2.5 Time Series Analysis**

### Instructions:

Following the workflow demonstrated in the class materials, create **three different visualizations** of PM2.5 data for your chosen location. You must adapt the code examples to your specific region and time period.

### Required Analyses:

**Analysis A: Long-term Trend (MERRA-2 Dataset)**
- Use MERRA-2 data for a **2-year period** (2020-2021 or 2021-2022)
- Create a daily time series chart
- Focus on identifying long-term patterns and unusual events

**Analysis B: High-Resolution Monthly Analysis (GHAP Dataset)**  
- Use GHAP data for **one full year** (choose from 2017-2022)
- Create a monthly aggregated PM2.5 chart
- **CRITICAL:** Apply the 0.1 scale factor for proper unit conversion

**Analysis C: Seasonal Pattern Analysis**
- Use either GHAP or MERRA-2 data
- Create a seasonal comparison chart (Spring, Summer, Fall, Winter)
- Calculate seasonal averages for your location

### Code Adaptation Requirements:
- **Change the coordinates** to your chosen city location
- **Adjust time periods** as specified for each analysis
- **Modify chart titles** to reflect your study area
- **Ensure proper scale factors** are applied (especially for GHAP data)
- **Use appropriate pixel scales** (1000m for GHAP, 50000m for MERRA-2)

### Technical Notes:
- For MERRA-2: Use the surface dust component (DUSMASS25) or construct full PM2.5 using the formula shown in class
- For GHAP: Must apply 0.1 scale factor to convert to μg/m³
- Keep monthly aggregations to avoid computational limits
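
The two unit conversions used in this assignment are simple multiplications. The plain-JavaScript sketch below makes no Earth Engine calls; the function names and sample inputs are hypothetical and chosen only for illustration. It shows the arithmetic: MERRA-2 concentrations in kg/m³ are multiplied by 1e9 to obtain µg/m³, and GHAP stored values are multiplied by the 0.1 scale factor.

```javascript
// Illustrative only: helper names and sample inputs are assumptions,
// not part of the assignment code.

// MERRA-2 surface mass concentrations are in kg/m^3; 1 kg = 1e9 µg.
function kgPerM3ToUgPerM3(valueKgM3) {
  return valueKgM3 * 1e9;
}

// GHAP stores PM2.5 as scaled values; multiplying by 0.1 gives µg/m^3.
function ghapToUgPerM3(storedValue) {
  return storedValue * 0.1;
}

// Hypothetical sample inputs (not measurements).
var merraSample = 1.2e-8; // kg/m^3
var ghapSample = 153;     // stored (unscaled) value

console.log(kgPerM3ToUgPerM3(merraSample).toFixed(1) + " µg/m^3");
console.log(ghapToUgPerM3(ghapSample).toFixed(1) + " µg/m^3");
```

Comparing a value from the Inspector tool against this arithmetic is a quick way to confirm that a scale factor was applied exactly once.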

In [None]:
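var interested_region = ee.Geometry.Point([-119.02, 35.37]); // Bakersfield, California
var start = "2021-01-01";
var end = "2023-01-01"; // filterDate treats the end date as exclusive, so this covers all of 2022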
// PM2.5 reconstruction formula = Dust surface + Sea salt + Organic carbon + Black Carbon + Sulfate factor*SO4
var hourly_concentration = ee.ImageCollection("NASA/GSFC/MERRA/aer/2")
.select(["DUSMASS25", "SSSMASS25", "OCSMASS", "BCSMASS", "SO4SMASS"])
.filterDate(start, end)
.filterBounds(interested_region);

// Build PM2.5 by summing the components (units: kg/m^3)
var toPM25 = function(img) {
  var so4_conv = img.select("SO4SMASS").multiply(1.375); // Sulfate factor
  var pm25 = img.select(["DUSMASS25", "SSSMASS25", "OCSMASS", "BCSMASS"])
  .addBands(so4_conv.rename("SO4_conv"))
  .reduce(ee.Reducer.sum())
  .rename("PM25");
  return pm25.copyProperties(img, ["system:time_start","system:time_end"]);
};

var hourlyPM = hourly_concentration.map(toPM25).filter(ee.Filter.notNull(["system:time_start"]));

var nDays = ee.Date(end).difference(ee.Date(start), "day");
var days = ee.List.sequence(0, nDays.subtract(1));

// Compute daily means from the hourly images (kg/m^3)
var dailyPM = ee.ImageCollection.fromImages(days.map(function(d){
  var d0 = ee.Date(start).advance(d, "day");
  var d1 = d0.advance(1, "day");
  var subset = hourlyPM.filterDate(d0, d1);
  return ee.Algorithms.If(
    subset.size().gt(0),
    subset.mean().set("system:time_start", d0.millis()),
    null
    );
  })).filter(ee.Filter.notNull(["system:time_start"]));

// Convert daily averages to µg/m^3 (1 kg = 1e9 µg)
var dailyPM_micrograms = dailyPM.map(function(img) {
  // Daily images carry only system:time_start (set when the daily means were built)
  return img.multiply(1e9).rename("PM25_microgramm3")
    .copyProperties(img, ["system:time_start"]);
});

// Chart
var chart = ui.Chart.image.series({
  imageCollection: dailyPM_micrograms,
  region: interested_region,
  reducer: ee.Reducer.mean(),
  scale: 50000
}).setOptions({
  title: "Bakersfield, CA - Daily Surface PM2.5 (MERRA-2, 2021-2022)",
  vAxis: {title: "PM2.5 (µg/m^3)"},
  hAxis: {title: "Date"},
  interpolateNulls: true,
  lineWidth: 2,
  pointSize: 1,
});

print(chart);
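
One detail worth checking in the cell above: Earth Engine's `filterDate(start, end)` treats `end` as exclusive, so the number of daily bins equals the day difference between the two dates. The sketch below is plain JavaScript rather than Earth Engine code, and `countDays` is a hypothetical helper. It reproduces that count so the expected number of points on the daily chart can be sanity-checked.

```javascript
// Hypothetical helper: number of whole days in the half-open range [start, end).
function countDays(start, end) {
  var msPerDay = 24 * 60 * 60 * 1000;
  // Date-only ISO strings are parsed as UTC, so daylight saving time
  // does not affect the difference.
  return Math.round((Date.parse(end) - Date.parse(start)) / msPerDay);
}

console.log(countDays("2021-01-01", "2022-12-31")); // 729: 31 Dec 2022 is not covered
console.log(countDays("2021-01-01", "2023-01-01")); // 730: both years fully covered
```

To include the last day of a period, set the end date to the first day after it.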

In [None]:
// GHAP data for 2021
var interested_region = ee.Geometry.Point([-119.02, 35.37]); // Bakersfield, California
var start = "2021-01-01";
var end = "2022-01-01"; // end date is exclusive, so this covers all of 2021

var ghap = ee.ImageCollection("projects/sat-io/open-datasets/GHAP/GHAP_D1K_PM25")
  .filterDate(start, end)
  .filterBounds(interested_region);

// Check availability
print("Total daily images:", ghap.size());

// Scale factor: 0.1 -> convert to µg/m³
var applyScale = function(img) {
  return img.multiply(0.1).copyProperties(img, ["system:time_start"]);
};
var daily = ghap.map(applyScale);

// Monthly means (12 images, tag month and canonical time)
var monthlyAverages = ee.ImageCollection.fromImages(
  ee.List.sequence(1, 12).map(function(m) {
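    var monthlyMean = daily.filter(ee.Filter.calendarRange(m, m, "month")).mean();
    // system:time_start must be a number (milliseconds since epoch), not an ee.Date
    return monthlyMean.set("system:time_start", ee.Date.fromYMD(2021, m, 1).millis());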
  })
);

// Monthly time series chart
var chart = ui.Chart.image.series({
  imageCollection: monthlyAverages,
  region: interested_region,
  reducer: ee.Reducer.mean(),
  scale: 1000
}).setOptions({
  title: "Bakersfield, California - Monthly PM2.5 (GHAP, 2021)",
  vAxis: {title: "PM2.5 (µg/m³)"},
  hAxis: {title: "Date"},
  lineWidth: 3,
  pointSize: 4
});
print(chart);

// Optional: daily chart (can be heavy)
var dailyChart = ui.Chart.image.series({
  imageCollection: daily,
  region: interested_region,
  reducer: ee.Reducer.mean(),
  scale: 1000
}).setOptions({
  title: "Bakersfield, California - Daily PM2.5 (GHAP, 2021)",
  vAxis: {title: "PM2.5 (µg/m³)"},
  hAxis: {title: "Date"},
  lineWidth: 1
});
print(dailyChart);

In [None]:
// Continues from the previous cell (reuses interested_region, start, and end)
// Goal: label each monthly mean with its season, then average by season

// GHAP daily PM2.5 (1 km) -> apply scale factor 0.1 and rename to PM25
var daily = ee.ImageCollection("projects/sat-io/open-datasets/GHAP/GHAP_D1K_PM25")
  .filterDate(start, end)
  .filterBounds(interested_region)
  .map(function(img){
    return img.multiply(0.1)
              .rename("PM25")
              .copyProperties(img, ["system:time_start"]);
  });

// Monthly means (12 images, tag month and canonical time)
var monthlyAverages = ee.ImageCollection.fromImages(
  ee.List.sequence(1, 12).map(function(m){
    var mImg = daily.filter(ee.Filter.calendarRange(m, m, "month")).mean();
    return mImg.set({
      "month": m,
      "system:time_start": ee.Date.fromYMD(2021, m, 1).millis() // data year is 2021
    });
  })
);

var addSeason = function(img){
  var m = ee.Number(img.get("month"));
  var season = ee.String(ee.Algorithms.If(
    m.gte(3).and(m.lte(5)), "Spring",
    ee.Algorithms.If(
      m.gte(6).and(m.lte(8)), "Summer",
      ee.Algorithms.If(
        m.gte(9).and(m.lte(11)), "Fall", "Winter"
      )
    )
  ));
  return img.set("season", season);
};
var monthlySeasoned = monthlyAverages.map(addSeason);

// Compute seasonal averages at ROI (returns a compact FeatureCollection)
var seasonNames = ee.List(["Spring", "Summer", "Fall", "Winter"]);
var seasonalFC = ee.FeatureCollection(seasonNames.map(function(s){
  var meanImg = monthlySeasoned.filter(ee.Filter.eq("season", s)).mean();
  var seasonAverage = meanImg.reduceRegion({
    reducer: ee.Reducer.mean(),
    geometry: interested_region,
    scale: 1000
  }).get("PM25");
  return ee.Feature(null, {season: s, pm25: seasonAverage});
}));

print("Seasonal PM2.5 Averages (GHAP 2021, µg/m³):", seasonalFC);

// Column chart with clean axes (y starts at 0, x labels hidden)
var chart = ui.Chart.feature.byFeature({
  features: seasonalFC,
  xProperty: "season",
  yProperties: ["pm25"]
})
.setChartType("ColumnChart")
.setOptions({
  title: "Seasonal PM2.5 at Bakersfield, California (GHAP 2021)",
  hAxis: {title: "Season", ticks: []},
  vAxis: {title: "PM2.5 (µg/m³)", viewWindow: {min: 0}}, // start y-axis at 0
  legend: { position: "none" },
  colors: ["#66c2a5"],
  bar: { groupWidth: "80%" }
});
print(chart);
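
The month-to-season grouping used above (March to May as Spring, June to August as Summer, September to November as Fall, and the remaining months as Winter) can be checked outside Earth Engine. This plain-JavaScript sketch uses synthetic inputs (the numbers 1 to 12) and hypothetical helpers `seasonOf` and `seasonalMeans`. It illustrates only the grouping and averaging logic, not real PM2.5 results.

```javascript
// Hypothetical helper mirroring the month ranges used in addSeason.
function seasonOf(month) {
  if (month >= 3 && month <= 5) return "Spring";
  if (month >= 6 && month <= 8) return "Summer";
  if (month >= 9 && month <= 11) return "Fall";
  return "Winter"; // December, January, February
}

// Average 12 monthly values by season; each month carries equal weight.
function seasonalMeans(monthlyValues) {
  var sums = {};
  var counts = {};
  monthlyValues.forEach(function (value, index) {
    var season = seasonOf(index + 1); // index 0 is January
    sums[season] = (sums[season] || 0) + value;
    counts[season] = (counts[season] || 0) + 1;
  });
  var means = {};
  Object.keys(sums).forEach(function (season) {
    means[season] = sums[season] / counts[season];
  });
  return means;
}

// Synthetic inputs for January..December (not measurements).
var synthetic = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];
console.log(seasonalMeans(synthetic));
```

With a single calendar year of data, Winter combines January, February, and December of that same year rather than one contiguous December-to-February season, which is worth keeping in mind when interpreting the winter bar.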

# **Task 3: Generate and Export Results**

1. **Run all three analyses** in Google Earth Engine
2. **Save high-quality screenshots** of each chart/visualization
3. **Record key numerical results** (seasonal averages, peak values, etc.)
4. **Note any interesting patterns** or unusual events in your data

### Required Figures:
- Figure 1: Long-term time series (MERRA-2)
- Figure 2: Monthly PM2.5 patterns (GHAP)
- Figure 3: Seasonal comparison chart
- All figures must include proper titles, axis labels, and units

# **Assignment Report (10 points total)**

### Instructions:
Write a brief report (**maximum 2 pages**) describing your work and findings. **Do not include any code in the report.**

### Report Structure and Point Breakdown:

**1. Introduction (2 points)**
- Describe your chosen study area and its location
- Explain why you selected this area (industrial activity, wildfire risk, population density, personal interest, etc.)
- Provide context about known air quality issues or pollution sources in the region
- State your analysis objectives

**2. Methods (1 point)**
- Briefly describe the MERRA-2 and GHAP datasets (spatial/temporal resolution, units)
- Explain your analytical approach for each of the three analyses
- Mention the time periods you selected and why
- **Do not include any code in the report**

**3. Results (5 points)**
- **Include all three figures** with clear captions
- Describe the PM2.5 patterns you observe in each analysis:
  - Long-term trends: Any increasing/decreasing patterns? Unusual spikes?
  - Monthly patterns: Which months show highest/lowest PM2.5?
  - Seasonal patterns: Which season has the worst air quality? Best?
- **Provide specific quantitative values** (peak PM2.5 levels, seasonal averages, etc.)
- Compare patterns between the different datasets/time scales
- Discuss any seasonal variations and what might cause them

**4. Discussion and Conclusion (2 points)**
- Interpret your results in the context of your study area (what causes the patterns you observed?)
- How do your findings compare to known air quality issues in your region?
- What are the implications for public health in your study area?
- Comment on the usefulness of satellite data for air quality monitoring
- What did you learn about your chosen region's air quality?

### Requirements:
- **Maximum 2 pages** including all figures
- Include all three figures in the Results section
- Professional formatting with proper figure captions
- **Do not include any code in the report**
- Proper spelling and grammar
- Save as PDF for submission

---

## Submission Requirements

### Files to Submit

**I. Report (graded component):**
   - Submit a maximum 2-page report as a PDF through Canvas
   - Name: `LastName_FirstName_Report.pdf`
   - **Do not include code within the report**

**II. All Other Materials (via Git):**
   - Organize all working assignment files in the Git repository
      - GEE script (JavaScript code saved as `.txt`; name: `LastName_FirstName_GEE_Script.txt`)
      - Screenshots (all three visualization charts as high-quality images; names: `LastName_FirstName_Figure1.png`, `LastName_FirstName_Figure2.png`, `LastName_FirstName_Figure3.png`)
      - Any additional notes or intermediate results (optional)
   - Maintain clear file organization within the repository; commit all files with logical folder structure

**III. Completed AI-Usage form (via Git)**

---

## Grading Rubric (10 Points Total)

**Only the report will be graded. All other materials (code, screenshots, data files) are for your learning and reference but will not be evaluated.**

| Report Section | Points | Criteria |
|----------------|--------|----------|
| Introduction | 2 | Clear area description, rationale for selection, relevant background context, stated objectives |
| Methods | 1 | Understanding of datasets, appropriate time period selection, clear methodology explanation |
| Results | 5 | All three figures included with proper captions, detailed pattern descriptions, specific quantitative observations, comparison between analyses |
| Discussion/Conclusion | 2 | Thoughtful interpretation of results, connection to regional context, public health implications, personal insights |
| **Total** | **10** | **Professional writing, proper formatting, within 2-page limit** |

---

## Tips for Success

**Study Area Selection:**
- Choose areas with known air quality issues for more interesting results
- Consider seasonal factors (wildfire seasons, winter heating, summer ozone)
- Urban areas typically show more variation than rural locations

**Technical Tips:**
- Start early to allow time for troubleshooting
- Use the Inspector tool to verify your data values make sense
- Double-check unit conversions (GHAP requires 0.1 scale factor)
- If charts appear blank, check your coordinate location and date ranges
- Save your work frequently

**Analysis Tips:**
- Look for seasonal patterns (typically winter peaks, summer minimums in temperate regions)
- Consider what might cause spikes (wildfires, industrial events, weather patterns)
- Compare your results across different time scales (daily vs monthly vs seasonal)

**Report Tips:**
- Include specific values (not just "high" or "low" - give actual μg/m³ values)
- Connect your results to local knowledge about your study area
- Use proper figure captions (e.g., "Figure 1: Daily PM2.5 concentrations in Denver, CO...")
- Proofread carefully before submission
- Write clearly and concisely