TexasBearfan/Project_3_Data_Visualization

Project_3_Data_Visualization

The Most Dangerous U.S. National Parks for Adventure Seekers: An Assessment of Number, Cause, and Overall Trends of Deaths in the Parks

Team Members (Project Group 2): Barrett Fudge, Larry Jiles Jr., Carlie Rhoads, Gregory Smith

Project Description/Outline: Our project focuses on analyzing data related to deaths in U.S. National Parks with the goal of identifying the most dangerous parks for adventure seekers. We will assess the number, cause, and trend of deaths in these parks from 2007 to 2023. The project falls under the data visualization track.

Dataset to be Used: We will be utilizing the National Park Service (NPS) API to gather data on deaths that occurred in U.S. National Parks. The breakdown and organization of the data are still to be determined as the NPS API offers a wide range of data, and we are currently exploring its depth.

Rough Breakdown of Tasks:

Data Processing and Cleaning: We first cleaned and organized the data obtained from the NPS API, checking for null and duplicate entries, and used Pandas to clean and format the datasets. We then created a notebook documenting the data exploration and cleanup process.
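The cleaning step described above can be sketched roughly as follows. This is a minimal illustration, not the code from our notebook: the column names (`park_name`, `incident_date`, `cause_of_death`) and the sample rows are hypothetical placeholders, and the real dataset may be organized differently.

```python
import pandas as pd

# Made-up sample rows for illustration only; these are NOT project data.
# Column names are assumptions and may not match the real dataset.
raw = pd.DataFrame(
    {
        "park_name": [
            "Yosemite National Park",
            "Yosemite National Park",
            "Grand Canyon National Park",
            None,
        ],
        "incident_date": ["2015-06-01", "2015-06-01", "2018-03-14", "2020-01-05"],
        "cause_of_death": ["Fall", "Fall", " drowning ", "Fall"],
    }
)


def clean_deaths(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no duplicate rows, no rows missing key fields."""
    out = df.drop_duplicates()
    # Rows without a park or a date cannot be used in any of the charts.
    out = out.dropna(subset=["park_name", "incident_date"])
    out = out.assign(
        # Unparseable dates become NaT and are dropped just below.
        incident_date=pd.to_datetime(out["incident_date"], errors="coerce"),
        # Normalize spacing and capitalization so causes group correctly.
        cause_of_death=out["cause_of_death"].str.strip().str.title(),
    )
    out = out.dropna(subset=["incident_date"])
    out = out.assign(year=out["incident_date"].dt.year)
    return out.reset_index(drop=True)


cleaned = clean_deaths(raw)
print(cleaned)
```

Deriving a `year` column at this stage keeps the later trend analysis to a simple group-by.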

Analysis of Data: Our group analyzed the cleaned data to extract insights into the number, cause, and trend of deaths in U.S. National Parks. We then developed code to perform the analysis, providing explanations for each section.
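As a rough sketch of this kind of analysis, the snippet below finds the most frequent cause of death per park and builds a year-by-park table of death counts. The column names and the sample rows are hypothetical and made up for illustration; they are not results from our dataset.

```python
import pandas as pd

# Made-up sample rows for illustration only; these are NOT project results.
deaths = pd.DataFrame(
    {
        "park_name": [
            "Lake Mead National Recreation Area",
            "Lake Mead National Recreation Area",
            "Lake Mead National Recreation Area",
            "Yosemite National Park",
            "Yosemite National Park",
        ],
        "year": [2007, 2007, 2008, 2007, 2008],
        "cause_of_death": ["Drowning", "Drowning", "Motor Vehicle Crash", "Fall", "Fall"],
    }
)

# Most frequent cause of death in each park.
top_cause = deaths.groupby("park_name")["cause_of_death"].agg(
    lambda causes: causes.value_counts().idxmax()
)

# Deaths per year and park: one row per year, one column per park.
# This shape is convenient for drawing one line per park.
yearly_trend = deaths.groupby(["year", "park_name"]).size().unstack(fill_value=0)

print(top_cause)
print(yearly_trend)
```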

Visualization: We used various visualization techniques, such as maps, pie charts, and line graphs, to present the analyzed data. We created visualizations to address questions such as the distribution of deaths across parks, the top causes of death, and trends in the most dangerous parks over time. The group then implemented interactive features where applicable, such as drop-down menus for park selection.

Presentation Preparation: Our next steps were to develop visually appealing slides for the presentation using Google Slides for group collaboration, ensuring slides are clear, professional, and relevant to the material being presented. We assigned speaking roles for each member and practiced the presentation to maintain audience interest and coherence.

Questions To be Addressed and Visualizations:

-Map: Number of deaths in each park, followed by each park’s top manner of death.
-Pie Chart: Breakdown of cause of death across the top five national parks.
-Line Graph: Trend analysis of the top five most dangerous parks based on the number of deaths from 2007 to 2023.

Definition of “most dangerous”: We define "most dangerous" by the total number of deaths recorded in each park over the specified period (2007 to 2023), with consideration given to the frequency and severity of incidents.
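The core of this definition, ranking parks by total deaths within the study period, can be sketched as below. The function name `most_dangerous_parks`, the column names, and the placeholder parks are all hypothetical, and the sketch covers only the total-count part of the definition, not frequency or severity.

```python
import pandas as pd


def most_dangerous_parks(
    deaths: pd.DataFrame, n: int = 5, start_year: int = 2007, end_year: int = 2023
) -> pd.Series:
    """Return the n parks with the most recorded deaths in the given period."""
    # Keep only records inside the study period (both end years included).
    in_period = deaths[deaths["year"].between(start_year, end_year)]
    totals = in_period.groupby("park_name").size()
    return totals.nlargest(n)


# Made-up placeholder records, not project data.
sample = pd.DataFrame(
    {
        "park_name": ["Park A", "Park A", "Park A", "Park B", "Park B", "Park C", "Park C"],
        "year": [2007, 2010, 2023, 2012, 2006, 2015, 2024],
    }
)

ranking = most_dangerous_parks(sample, n=2)
print(ranking)
```

Raw totals do not account for visitor numbers, so a heavily visited park can rank high even if the risk per visit is low.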

Parks Used:

-Lake Mead National Recreation Area
-Grand Canyon National Park
-Yosemite National Park
-Natchez Trace Parkway
-Golden Gate National Recreation Area

Visualization of Data:

We used line graphs, histograms, and Plotly interactive maps with latitude/longitude markers to visualize the analyzed data. We developed code to create the visualizations, and the results will be presented to provide insight into the trends and patterns observed in the dataset.

Presentation:

This ReadMe provides an overview of our project goals, tasks, and methodologies. We look forward to presenting our findings and insights on the most dangerous U.S. National Parks for adventure seekers.

Ethical Considerations:

As we delve into analyzing data related to deaths in U.S. National Parks, it's crucial to acknowledge and address several ethical considerations that arise throughout our project:

-Sensitivity and Respect for Victims: We recognize the sensitive nature of our subject matter, which involves the loss of human lives. It's essential to approach the analysis and presentation of this data with empathy, respect, and sensitivity towards the victims and their families.

-Data Privacy and Anonymity: While our analysis focuses on aggregated data, we must ensure the privacy and anonymity of individuals involved in incidents. Any personally identifiable information should be handled with the utmost care and only used for analytical purposes in accordance with applicable data protection laws and regulations.

-Avoiding Sensationalism: The portrayal of National Parks as dangerous destinations should be handled cautiously to avoid sensationalizing incidents and potentially discouraging visitors from exploring these natural wonders. We aim to present our findings in an informative and balanced manner, highlighting safety measures and recommendations for park visitors.

-Bias and Misinterpretation: We acknowledge the potential for bias in data collection and interpretation, which may skew our analysis. We strive to mitigate bias by critically evaluating our methods and sources, seeking diverse perspectives, and transparently documenting our assumptions and limitations.

-Social Responsibility: As data analysts, we have a responsibility to use our findings ethically and responsibly, considering the broader societal implications of our work. This includes advocating for improved safety measures in National Parks, raising awareness about risks associated with outdoor activities, and promoting responsible recreation practices.

-Informed Consent and Data Ownership: If our analysis involves user-generated content or personal data obtained from sources such as social media or public forums, we must respect users' rights to informed consent and data ownership. We will only use such data in compliance with relevant terms of service and privacy policies, ensuring transparency and accountability in our data practices.

-Continuous Reflection and Improvement: Throughout our project, we commit to engaging in continuous reflection and dialogue regarding the ethical implications of our work. We will actively seek feedback from peers, stakeholders, and domain experts to identify areas for improvement and uphold the highest ethical standards in our data analysis and presentation.

By addressing these ethical considerations proactively, we aim to conduct our project in a responsible and ethical manner, ensuring that our findings contribute positively to the understanding and appreciation of U.S. National Parks while respecting the dignity and privacy of all individuals involved.

New Libraries Used

We used two new libraries for our visualizations: Chart.js and jQuery. Chart.js was initially selected for the pie chart visualization, but it did not support creating a pie chart together with a drop-down menu. To address this limitation, we incorporated jQuery into our solution.

References for any code used that is not your own

In our project, we are committed to upholding ethical standards, including giving proper credit and attribution for any code used that is not our own. Below are references for external code or libraries that we may utilize in our project:

Leaflet.js: The Leaflet JavaScript library is used for interactive maps. Reference: Leaflet

Chart.js: Chart.js is used for creating interactive charts and graphs. Reference: Chart.js

jQuery: jQuery is used for DOM manipulation and AJAX requests. Reference: jQuery

D3.js: D3.js is used for data visualization and manipulation. Reference: D3.js

Marker Cluster Plugin: The Leaflet.markercluster plugin is used for clustering markers on maps. Reference: Leaflet.markercluster

Pandas: Pandas library is used for data cleaning and manipulation. Reference: Pandas

Plotly: Plotly library is used for creating interactive plots and visualizations. Reference: Plotly

Google Slides API: Google Slides API is used for creating and collaborating on presentation slides. Reference: Google Slides API

Express.js: Express.js is used for building the backend server. Reference: Express.js

Node.js: Node.js is used for server-side JavaScript runtime. Reference: Node.js

We ensure that we comply with the licensing terms and requirements of each referenced library or tool, including providing proper attribution where necessary. Additionally, we strive to understand and adhere to best practices for using third-party code responsibly and ethically.

Source Data:

ChatGPT
Provider: OpenAI
Model Version: GPT-3.5
Training Data: Diverse internet text
Training Duration: About 3-4 hours

@article{openai2023,
  author = {OpenAI},
  title = {ChatGPT: A Language Model by OpenAI},
  year = {2023},
  url = {https://www.openai.com},
}

BCS app within Slack app

Classmates

Stackoverflow
