# **DEON ETHICS CHECKLIST**

# **Step 1: Install the Deon Library**

In [None]:
!pip install deon



Collecting deon
  Downloading deon-0.3.0.tar.gz (26 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: deon
  Building wheel for deon (pyproject.toml) ... done
  Created wheel for deon: filename=deon-0.3.0-py3-none-any.whl size=21352 sha256=1161b4095b5b5e4f300be85ec2ae8f242c6eb1fae6e351ae200e9edac5fb9e41
  Stored in directory: /root/.cache/pip/wheels/65/8e/b0/6215389003a488515cb8287265bc54506a636bbf043d79c2b8
Successfully built deon
Installing collected packages: deon
Successfully installed deon-0.3.0
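
Before moving on, it can help to confirm from Python that the package is visible to the current kernel. The sketch below uses only the standard library; the helper name `installed_version` is introduced here for illustration and is not part of Deon.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


deon_version = installed_version("deon")
if deon_version is None:
    print("deon is not installed; run `pip install deon` first")
else:
    print(f"deon {deon_version} is installed")
```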


# **Step 2: Check Available Commands**

In [None]:
!deon --help


Usage: deon [OPTIONS]

  Easily create an ethics checklist for your data science project.

  The checklist will be printed to standard output by default. Use the
  --output option to write to a file instead.

Options:
  -l, --checklist PATH  Override default checklist file with a path to a
                        custom checklist.yml file.
  -f, --format TEXT     Output format. Default is "markdown". Can be one of
                        [ascii, html, jupyter, markdown, rmarkdown, rst].
                        Ignored and file extension used if --output is passed.
  -o, --output PATH     Output file path. Extension can be one of [.txt,
                        .html, .ipynb, .md, .rmd, .rst]. The checklist is
                        appended if the file exists.
  -w, --overwrite       Overwrite output file if it exists. Default is False,
                        which will append to existing file.
  -m, --multicell       For use with Jupyter format only. Write checklist with
            

# **Step 3: Create the Deon Checklist**

In [None]:
# Generate the checklist with the Deon CLI. The .md extension selects markdown;
# per `deon --help`, --format is ignored when --output is passed.
!deon --output wildfire_ethics_checklist.md


Checklist successfully written to file wildfire_ethics_checklist.md.
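
According to the help text in Step 2, Deon appends to the output file if it already exists, so re-running the cell above would duplicate the checklist. A minimal sketch of calling the CLI from Python with `--overwrite` follows; the helper `build_deon_command` is a name made up for this example, and only flags listed in `deon --help` are used.

```python
import shutil
import subprocess


def build_deon_command(output_path, overwrite=False, checklist=None):
    """Assemble the argument list for the deon CLI (flags taken from `deon --help`)."""
    cmd = ["deon", "--output", output_path]
    if overwrite:
        cmd.append("--overwrite")
    if checklist is not None:
        cmd.extend(["--checklist", checklist])
    return cmd


cmd = build_deon_command("wildfire_ethics_checklist.md", overwrite=True)
if shutil.which("deon"):
    # Raises CalledProcessError if deon exits with a non-zero status
    subprocess.run(cmd, check=True)
else:
    print("deon not found on PATH; would run:", " ".join(cmd))
```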


# **Step 4: Read and Display the Checklist**

In [None]:
# Read and display the generated checklist
with open('wildfire_ethics_checklist.md', 'r') as file:
    checklist_content = file.read()

print(checklist_content)


# Data Science Ethics Checklist

[![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

## A. Data Collection
 - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
 - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
 - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
 - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?

## B. Data Storage
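
Because every item in the generated file follows the same ` - [ ] **ID Summary**: question` pattern, the checklist can also be parsed to track progress. The sketch below is one possible approach, not something Deon provides; the sample text is abbreviated, and marking a finished item with `[x]` is an assumption based on the usual markdown task-list convention.

```python
import re

# Matches lines such as " - [ ] **A.1 Informed consent**: question text"
ITEM_PATTERN = re.compile(
    r"^\s*- \[(?P<mark>[ xX])\] \*\*(?P<id>\S+) (?P<summary>[^*]+)\*\*: (?P<text>.+)$"
)


def parse_checklist(markdown_text):
    """Return one dict per checklist item found in the markdown text."""
    items = []
    for line in markdown_text.splitlines():
        match = ITEM_PATTERN.match(line)
        if match:
            items.append({
                "id": match["id"],
                "summary": match["summary"],
                "done": match["mark"].lower() == "x",
                "text": match["text"],
            })
    return items


# Abbreviated sample; in the notebook, pass checklist_content instead
sample = """## A. Data Collection
 - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent?
 - [x] **A.2 Collection bias**: Have we considered sources of bias?
"""
items = parse_checklist(sample)
open_items = [item["id"] for item in items if not item["done"]]
print(f"{len(open_items)} of {len(items)} items still open: {open_items}")
```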


# **Customize the Checklist for Your Project**

Customize the checklist for your specific project. In this case the project is the prediction of forest wildfires. The input variables are Temperature, Humidity, Wind Speed, Rainfall, Fuel Moisture, Vegetation Type, Slope, and Region. The output variables are Fire Size, Fire Duration, Suppression Cost, and Fire Occurrence.

In [None]:
# Customized Deon Ethics Checklist for Forest Wildfire Prediction Model

custom_checklist = """
# Ethics Checklist for Forest Wildfire Prediction Model

## 1. Data Collection
- **Input Variables**: Temperature, Humidity, Wind Speed, Rainfall, Fuel Moisture, Vegetation Type, Slope, Region
- **Output Variables**: Fire Size, Fire Duration, Suppression Cost, Fire Occurrence
- Are the data sources, such as weather, vegetation, and geographic data, properly licensed and legally available?
- Has any sensitive information, such as private property or personal location data, been anonymized?
- Have you obtained consent for data collected from private or proprietary sources, such as satellite imagery or drone footage?

## 2. Fairness & Justice
- How will you ensure that the model’s predictions are fair and do not disproportionately affect specific regions or communities (e.g., indigenous lands, rural areas)?
- What biases might exist in the historical data (e.g., underreporting of fires in certain regions), and how will you address these to ensure the model does not unfairly target or neglect specific areas?
- How will you balance fairness in handling both false positives (predicting a fire where there is none) and false negatives (failing to predict a fire)?
- Have you tested the model across different regions to ensure consistent performance across various forest types (tropical, temperate, etc.)?

## 3. Transparency
- How will you ensure transparency about the data sources, algorithms, and decision-making process of the model?
- What information will you make available to government agencies, the public, and environmental organizations?
- How will you communicate the model’s predictions and limitations to decision-makers so that they understand the risks involved?
- How will you explain false positives and false negatives to the affected communities or stakeholders, especially during critical events like evacuations?

## 4. Privacy
- How will you ensure the privacy of individuals whose data might be inadvertently captured (e.g., campers, rural residents) through satellite images, drones, or weather stations?
- What steps will you take to prevent the misuse of this data, especially in terms of tracking human activities in forest areas without their consent?
- If external data sources, such as drones or surveillance tools, are integrated into the model, how will you balance the need for accurate predictions with protecting individual privacy?

## 5. Accountability
- Who will be held accountable if the model incorrectly predicts a wildfire, resulting in unnecessary evacuations or failure to prevent a disaster?
- What system will you establish to monitor and adjust the model over time, ensuring it adapts to changing environmental and climate conditions?
- How will you communicate accountability measures to the public, especially in high-risk areas where wildfire prediction is critical?

## 6. Inclusivity
- How will you ensure the model includes diverse data from different types of forests (e.g., tropical, temperate) and regions, especially those that may be underrepresented in historical data collection?
- How will you ensure the model accounts for the needs of different communities, including vulnerable populations such as indigenous groups or rural residents who have unique relationships with the land?
- If certain regions or communities lack sufficient data (e.g., underreporting, lack of resources), how will you address this to avoid biased predictions?

## 7. Sustainability
- How will the model’s predictions affect long-term forestry practices, land management, and firefighting strategies over time?
- How will you ensure the model remains sustainable, considering the evolving nature of climate change and its effects on wildfire patterns?
- What are the broader social and environmental implications if this model becomes widely adopted (e.g., impacts on land use, deforestation policies, wildlife conservation)?
"""

# Save the customized checklist to a new markdown file
with open('custom_wildfire_ethics_checklist.md', 'w') as file:
    file.write(custom_checklist)

print("Custom ethics checklist saved as 'custom_wildfire_ethics_checklist.md'")


Custom ethics checklist saved as 'custom_wildfire_ethics_checklist.md'
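
The cell above writes the custom checklist by hand, which bypasses Deon. The help text in Step 2 mentions a `--checklist` option that takes a custom checklist.yml, so an alternative is to let Deon render the wildfire questions itself. The sketch below assumes the YAML layout of Deon's bundled checklist (`title`, `sections`, `section_id`, `lines`, `line_id`, `line_summary`, `line`); verify it against the installed version. The file name `wildfire_checklist.yml`, the helper `to_checklist_yaml`, and the short summaries are illustrative.

```python
import json

# Two sections abridged from the custom checklist above; summaries are illustrative
sections = [
    {
        "title": "Data Collection",
        "section_id": "A",
        "lines": [
            ("Licensing", "Are the weather, vegetation, and geographic data sources properly licensed and legally available?"),
            ("Anonymization", "Has sensitive information, such as private property or personal location data, been anonymized?"),
        ],
    },
    {
        "title": "Accountability",
        "section_id": "B",
        "lines": [
            ("Ownership", "Who will be held accountable if the model incorrectly predicts a wildfire?"),
        ],
    },
]


def to_checklist_yaml(title, sections):
    """Render the sections as YAML text in the layout assumed for Deon's checklist.yml."""
    # json.dumps yields double-quoted strings, which are also valid YAML scalars
    out = [f"title: {json.dumps(title)}", "sections:"]
    for section in sections:
        out.append(f"  - title: {json.dumps(section['title'])}")
        out.append(f"    section_id: {json.dumps(section['section_id'])}")
        out.append("    lines:")
        for number, (summary, question) in enumerate(section["lines"], start=1):
            line_id = f"{section['section_id']}.{number}"
            out.append(f"      - line_id: {json.dumps(line_id)}")
            out.append(f"        line_summary: {json.dumps(summary)}")
            out.append(f"        line: {json.dumps(question)}")
    return "\n".join(out) + "\n"


yaml_text = to_checklist_yaml("Ethics Checklist for Forest Wildfire Prediction Model", sections)
with open("wildfire_checklist.yml", "w", encoding="utf-8") as file:
    file.write(yaml_text)
print(yaml_text)
```

If the layout assumption holds, `!deon --checklist wildfire_checklist.yml --output custom_wildfire_ethics_checklist.md --overwrite` should produce the same checkbox format as the default checklist.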


# **Steps to Host This on Hugging Face**

Step 1: Upload the custom ethics checklist file to Hugging Face.

Step 2: Add an app.py file with the contents below.

In [None]:
# Requires gradio; in a notebook, run `!pip install gradio` before this cell
import gradio as gr

# Markdown content
md_content = """
# Ethics Checklist for Forest Wildfire Prediction Model

## 1. Data Collection
- **Input Variables**: Temperature, Humidity, Wind Speed, Rainfall, Fuel Moisture, Vegetation Type, Slope, Region
- **Output Variables**: Fire Size, Fire Duration, Suppression Cost, Fire Occurrence
- Are the data sources, such as weather, vegetation, and geographic data, properly licensed and legally available?
- Has any sensitive information, such as private property or personal location data, been anonymized?
- Have you obtained consent for data collected from private or proprietary sources, such as satellite imagery or drone footage?

## 2. Fairness & Justice
- How will you ensure that the model’s predictions are fair and do not disproportionately affect specific regions or communities (e.g., indigenous lands, rural areas)?
- What biases might exist in the historical data (e.g., underreporting of fires in certain regions), and how will you address these to ensure the model does not unfairly target or neglect specific areas?
- How will you balance fairness in handling both false positives (predicting a fire where there is none) and false negatives (failing to predict a fire)?
- Have you tested the model across different regions to ensure consistent performance across various forest types (tropical, temperate, etc.)?

## 3. Transparency
- How will you ensure transparency about the data sources, algorithms, and decision-making process of the model?
- What information will you make available to government agencies, the public, and environmental organizations?
- How will you communicate the model’s predictions and limitations to decision-makers so that they understand the risks involved?
- How will you explain false positives and false negatives to the affected communities or stakeholders, especially during critical events like evacuations?

## 4. Privacy
- How will you ensure the privacy of individuals whose data might be inadvertently captured (e.g., campers, rural residents) through satellite images, drones, or weather stations?
- What steps will you take to prevent the misuse of this data, especially in terms of tracking human activities in forest areas without their consent?
- If external data sources, such as drones or surveillance tools, are integrated into the model, how will you balance the need for accurate predictions with protecting individual privacy?

## 5. Accountability
- Who will be held accountable if the model incorrectly predicts a wildfire, resulting in unnecessary evacuations or failure to prevent a disaster?
- What system will you establish to monitor and adjust the model over time, ensuring it adapts to changing environmental and climate conditions?
- How will you communicate accountability measures to the public, especially in high-risk areas where wildfire prediction is critical?

## 6. Inclusivity
- How will you ensure the model includes diverse data from different types of forests (e.g., tropical, temperate) and regions, especially those that may be underrepresented in historical data collection?
- How will you ensure the model accounts for the needs of different communities, including vulnerable populations such as indigenous groups or rural residents who have unique relationships with the land?
- If certain regions or communities lack sufficient data (e.g., underreporting, lack of resources), how will you address this to avoid biased predictions?

## 7. Sustainability
- How will the model’s predictions affect long-term forestry practices, land management, and firefighting strategies over time?
- How will you ensure the model remains sustainable, considering the evolving nature of climate change and its effects on wildfire patterns?
- What are the broader social and environmental implications if this model becomes widely adopted (e.g., impacts on land use, deforestation policies, wildlife conservation)?
"""

def display_markdown():
    return md_content

# Create a Gradio interface
iface = gr.Interface(fn=display_markdown, inputs=[], outputs="markdown")
iface.launch()


ModuleNotFoundError: No module named 'gradio'
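
The app above repeats the whole checklist as a string, so the uploaded markdown file and the app can drift apart. One alternative, sketched here under the assumption that `custom_wildfire_ethics_checklist.md` sits next to app.py, is to read the file at startup and show it with `gr.Blocks` and `gr.Markdown`. The helper names `load_checklist` and `build_app` are made up for this example, and gradio is imported lazily so the file-loading part can be tried without it.

```python
from pathlib import Path

CHECKLIST_FILE = Path("custom_wildfire_ethics_checklist.md")


def load_checklist(path=CHECKLIST_FILE):
    """Return the checklist markdown, or a short notice if the file is missing."""
    path = Path(path)
    if path.exists():
        return path.read_text(encoding="utf-8")
    return f"*Checklist file `{path.name}` not found.*"


def build_app():
    """Build a Gradio page that renders the checklist as markdown."""
    # Imported here so a missing gradio install only fails when the app is built
    import gradio as gr

    with gr.Blocks(title="Wildfire Ethics Checklist") as demo:
        gr.Markdown(load_checklist())
    return demo


# In app.py, end the file with:
# build_app().launch()
```

Compared with `gr.Interface`, this layout has no submit button: the checklist is rendered as soon as the page loads.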

# **ETHICS DATACARD**

An Ethics DataCard is a documentation tool that gives a detailed, transparent account of the ethical considerations surrounding a dataset or model. It covers data collection, privacy, biases, fairness, accountability, and societal impact, so that stakeholders understand the potential ethical risks and implications. Deon generates a standardized checklist (the default one, or a custom checklist.yml supplied with `--checklist`), whereas an Ethics DataCard takes a narrative approach that allows more specific insight into the lifecycle of the dataset or model. DataCards are therefore more flexible and richer in context, but they require more detailed input and customization, which makes them more resource-intensive. Deon is easier to adopt for teams that want a quick, checklist-based structure.
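
Since a DataCard is free-form, it is easy to leave out one of the aspects listed above. The sketch below shows one way to check a draft card's section headings against that list; the keyword list and the helper names are assumptions made for this example rather than part of any DataCard standard.

```python
# Aspects named in the paragraph above, reduced to lowercase keywords
REQUIRED_ASPECTS = [
    "data collection", "privacy", "bias", "fairness", "accountability", "societal impact",
]


def section_titles(markdown_text):
    """Collect the text of every level-2 heading ("## ...") in lowercase."""
    return [
        line[3:].strip().lower()
        for line in markdown_text.splitlines()
        if line.startswith("## ")
    ]


def missing_aspects(markdown_text, required=REQUIRED_ASPECTS):
    """Return the required aspects that no section heading mentions."""
    titles = section_titles(markdown_text)
    return [aspect for aspect in required if not any(aspect in title for title in titles)]


draft_card = """# Ethics DataCard (draft)
## Data Collection Process
## Bias Considerations
## Privacy and Security
"""
print(missing_aspects(draft_card))
```

For the draft above, the check reports that fairness, accountability, and societal impact are not yet covered.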

# **Step 1: Define the Ethics DataCard Contents**


In [None]:
!pip install gradio
import gradio as gr

# Ethics DataCard content for forest wildfire prediction model
datacard_content = """
# Ethics DataCard for Forest Wildfire Prediction Model

## Dataset Overview
- **Input Variables**: Temperature, Humidity, Wind Speed, Rainfall, Fuel Moisture, Vegetation Type, Slope, Region
- **Output Variables**: Fire Size, Fire Duration, Suppression Cost, Fire Occurrence

## Data Collection Process
- Data sources include weather data, satellite imagery, vegetation reports, and topographical data.
- Data is collected from public and licensed sources, ensuring proper consent and anonymization of any private information (e.g., personal property location).

## Bias Considerations
- **Potential Bias**: Historical data may underreport wildfires in remote or underserved regions (e.g., rural or indigenous lands).
- **Mitigation**: The model is designed to minimize biases by cross-referencing multiple data sources and continuously monitoring the data collection process to ensure it includes diverse geographic regions.

## Fairness & Justice
- The model has been trained to predict wildfires across various geographic regions and forest types. Efforts have been made to avoid disproportionate impacts on vulnerable communities (e.g., rural populations, indigenous lands).
- Special attention is given to reducing false positives (unnecessary evacuations) and false negatives (failure to predict real wildfires), balancing the risk for all stakeholders.

## Privacy and Security
- Satellite data, weather station information, and geographic data are used, with efforts to anonymize any personal information inadvertently captured (e.g., campers, property owners).
- No social media or surveillance data is used without explicit consent.

## Sustainability and Environmental Impact
- The model aims to assist in sustainable land management and wildfire mitigation strategies by improving early prediction and reducing the severity of wildfires.
- It supports long-term environmental sustainability by informing decisions around land use, deforestation, and conservation practices.

## Model Limitations
- The model's accuracy may vary depending on the region and the quality of data available.
- There are limitations to predicting wildfires in regions with insufficient historical data, leading to potential inaccuracies.
- The model is regularly updated to incorporate new climate data and evolving environmental conditions.

## Accountability and Transparency
- The development team will monitor the model for performance over time, ensuring that it adapts to new data and environmental shifts.
- Stakeholders (e.g., firefighting agencies, local governments) will be informed of the model’s limitations, ensuring proper interpretation of the predictions.
- False predictions will be communicated to stakeholders, with a process in place for continuous feedback and model improvement.

## Societal Impact
- The model is designed to protect both human lives and the environment by enabling better planning and response to wildfire threats.
- It has the potential to inform policy changes in land management, conservation, and emergency response strategies.
"""

# Function to display the DataCard
def display_datacard():
    return datacard_content

# Gradio interface to display the ethics DataCard
iface = gr.Interface(fn=display_datacard, inputs=[], outputs="markdown")

# Launch the Gradio interface
iface.launch()


Collecting gradio
  Downloading gradio-5.5.0-py3-none-any.whl.metadata (16 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Downloading aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<1.0,>=0.115.2 (from gradio)
  Downloading fastapi-0.115.5-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Downloading ffmpy-0.4.0-py3-none-any.whl.metadata (2.9 kB)
Collecting gradio-client==1.4.2 (from gradio)
  Downloading gradio_client-1.4.2-py3-none-any.whl.metadata (7.1 kB)
Collecting markupsafe~=2.0 (from gradio)
  Downloading MarkupSafe-2.1.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting pydub (from gradio)
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting python-multipart==0.0.12 (from gradio)
  Downloading python_multipart-0.0.12-py3-none-any.whl.metadata (1.9 kB)
Collecting ruff>=0.2.2 (from gradio)
  Downloading ruff-0.7.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metad



# **Step 2: Define the requirements.txt File**

The requirements.txt file needs a single line:

gradio


# **Step 3: Upload the app.py and requirements.txt Files to Hugging Face**
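
The upload can be done through the web interface, or scripted. The sketch below uses `HfApi.upload_file` from the `huggingface_hub` package and assumes that the Space already exists and that you are logged in (for example via `huggingface-cli login`). The repo id in the final comment is a placeholder, and `files_ready` and `upload_to_space` are names made up for this example.

```python
from pathlib import Path

SPACE_FILES = ["app.py", "requirements.txt"]


def files_ready(directory=".", names=SPACE_FILES):
    """Split the expected file names into (present, missing) for `directory`."""
    base = Path(directory)
    present = [name for name in names if (base / name).is_file()]
    missing = [name for name in names if name not in present]
    return present, missing


def upload_to_space(repo_id, directory="."):
    """Upload app.py and requirements.txt to an existing Hugging Face Space."""
    # Lazy import: huggingface_hub is only needed when uploading
    from huggingface_hub import HfApi

    present, missing = files_ready(directory)
    if missing:
        raise FileNotFoundError(f"Missing files: {missing}")
    api = HfApi()  # uses the locally saved access token
    for name in present:
        api.upload_file(
            path_or_fileobj=str(Path(directory) / name),
            path_in_repo=name,
            repo_id=repo_id,
            repo_type="space",
        )


# Example call (placeholder Space name):
# upload_to_space("your-username/wildfire-ethics")
```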