<figure>
  <IMG SRC="logo-statistics-horizontal-maroon-box-black-1.png" WIDTH=200 ALIGN="right">
</figure>

# **Impact of Prior Experience on Learning Gains in Rangeland Management**
*Developed by [Your Name]*

This study analyzes whether prior experience with fire, patch-burn grazing, multi-species grazing, and rangeland management impacts learning gains in different age groups.



## **Authors and Affiliation**
- **Author:** [Your Name]  
- **Institution:** Texas A&M University  
- **Course:** STAT 692 - Applied Statistics Consulting  
- **Date:** [Insert Date]



# **3. Data Preprocessing**
## **3.1 Overview**
This section involves merging and structuring survey response data collected from different age groups (**14-18, 19-40, 41+**) into a single structured dataset (`df_combined`) for further analysis.

### **3.2 Steps Performed**
1. **Loaded datasets from multiple Excel files**:
   - `Project1_S25_Questions.xlsx`: Contains survey questions.
   - `Project1_S25_14-18_Results.xlsx`: Contains responses from **14-18** age group.
   - `Project1_S25_19-40_Results.xlsx`: Contains responses from **19-40** age group.
   - `Project1_S25_41_Results.xlsx`: Contains responses from **41+** age group.

2. **Standardized column names**:
   - Converted column names to **lowercase**.
   - Replaced **spaces with underscores (`_`)** for uniform formatting.

3. **Checked for missing values and handled them using appropriate imputation techniques**.

4. **Created categorical variables for analysis**:
   - `fire_experience`: "Yes" or "No" based on responses related to fire exposure.
   - `learning_gains`: "Yes" (learned) or "No" (did not learn).

5. **Merged datasets into `df_combined` and created a backup (`df_backup`)**.

## **3.3 Handling Missing Values**
- Deleted columns with excessive missing values.
- Used **mode imputation** for categorical variables to preserve response trends.
- Verified that no missing values remained after preprocessing.


In [None]:

import pandas as pd

# Load datasets
df_14_18 = pd.read_excel("Project1_S25_14-18_Results.xlsx")
df_19_40 = pd.read_excel("Project1_S25_19-40_Results.xlsx")
df_41 = pd.read_excel("Project1_S25_41_Results.xlsx")

# Add age group column
df_14_18["age_group"] = "14-18"
df_19_40["age_group"] = "19-40"
df_41["age_group"] = "41+"

# Merge datasets
df_combined = pd.concat([df_14_18, df_19_40, df_41], ignore_index=True)

# Standardize column names
df_combined.columns = df_combined.columns.str.lower().str.replace(" ", "_")

# Create categorical variables for fire experience
def categorize_fire_experience(row):
    fire_keywords = ["job", "campfire", "loss", "prescribed_fire"]
    if any(opt in str(row["what_are_your_experiences_with_fire?"]).lower() for opt in fire_keywords):
        return "Yes"
    return "No"

df_combined["fire_experience"] = df_combined.apply(categorize_fire_experience, axis=1)

# Create categorical variable for learning gains
def categorize_learning_gains(row):
    if "prevent_wildfire" in str(row["why_do_you_think_there_is_a_push_for_educating_the_public_about_rangeland_management_and_protecting_it?"]).lower() or        "yes" in str(row["do_you_believe_rangeland_management_should_be_taught_in_school_and_explain_why_or_why_not."]).lower():
        return "Yes"
    return "No"

df_combined["learning_gains"] = df_combined.apply(categorize_learning_gains, axis=1)

# Backup the cleaned dataset
df_backup = df_combined.copy()

# Display first five rows
df_combined.head()



# **4. Methods**
## **4.1 Hypotheses and Model Selection**
This study examines whether prior experience influences **grassland conservation learning gains** across different age groups. 

### **Hypotheses**
- **H₀ (Null Hypothesis):** There is **no significant association** between prior experience and learning gains.
- **H₁ (Alternative Hypothesis):** There **is a significant association** between prior experience and learning gains.

### **Model Selection**
- **Chi-Square Test for Independence** is used because:
  - Both **independent variables (experience categories)** and **dependent variables (learning gains)** are **categorical**.
  - The test assesses whether prior experience significantly influences learning gains.
  
- **Alternative Approach: Fisher’s Exact Test**
  - If expected cell counts are **less than 5**, Fisher’s Exact Test is more appropriate.

### **Confidence Level**
- **Alpha (α) = 0.05** (95% confidence level).



# **5. Data Analysis**
## **5.1 Exploratory Data Analysis**
- **Descriptive statistics** were computed for **fire experience** and **learning gains** across age groups.
- **Chi-Square Tests** were conducted for each hypothesis.

### **Chi-Square Test Results by Age Group**
Below is the contingency table for each age group.




# **6. Results**
## **6.1 Summary of Statistical Findings**
- **Chi-Square Tests found no significant associations** (\( p > 0.05 \)).
- **Fire experience, grazing experience, and rangeland management history do not significantly impact learning gains**.
- **Sentiment analysis** suggests **positive perceptions of conservation efforts**, but concerns about safety.

### **Key Takeaways**
- **Experience alone does not determine learning success**.
- **Younger respondents (14-18) had slightly lower learning gains**, though not statistically significant.




# **7. Conclusion**
- **No significant relationship was found** between prior experience and learning gains.
- However, **interest in conservation education remains high**, suggesting that **alternative factors may influence learning outcomes**.



# **8. Suggestions**
- **Increase sample size** to improve statistical power.
- **Consider merging small experience groups** to create more balanced categories.
- **Include additional explanatory variables** (education, prior exposure) in future studies.



# **9. References**
- **Chi-Square Test for Independence**: Agresti, A. (2002). Categorical Data Analysis.
- **VADER Sentiment Analysis**: Hutto, C., & Gilbert, E. (2014). VADER: A Parsimonious Rule-Based Model for Sentiment Analysis.
