# About the Final Project

For this project, you will implement a visualization using your data from Module 1 and your preliminary low-fidelity prototypes from Module 2 to address your stated goals.

You may implement this visualization using either Altair or another platform of your choice. 
- Once implemented, conduct your evaluation based on the plan outlined in your Module 3 discussion post, making sure to conduct your evaluation with at least three people. You may refine any part of your prior plan to reflect your evolving understanding of the challenges you are addressing. 
- Be sure to address how your plan has changed from these earlier posts as part of your discussion. 

### Your final project post should include: 

- A brief recap of your data, goals, and tasks, focusing on those that most directly influence your design
- Screenshots of and/or a link to your visualization implementation (see below for additional guidance)
- A summary of the key elements of your design and accompanying justification
- A discussion of your final evaluation approach, including the procedure, people recruited, and results. Note that, due to the difficulty of recruiting experts, you can use colleagues, friends, classmates, or family to evaluate your designs if experts or others from your target population are unavailable.
- A synthesis of your findings, including what elements of your approach worked well and what elements you would refine in future iterations.
  
Guidance and platforms for deploying Altair visualizations online include: 
- https://matthewkudija.com/blog/2018/06/22/altair-interactive/
- https://towardsdatascience.com/add-animated-charts-to-your-dashboards-with-streamlit-python-f41863f1ef7c
- https://towardsdatascience.com/creating-interactive-jupyter-notebooks-and-deployment-on-heroku-using-voila-aa1c115981ca

---

DTSA 5304: Fundamentals of Data Visualization
- Project: 40%
- Final visualization with write-up

The final project will use the dataset from week 1 (Visualization Project Part 1: Finding your Data).


Task:
- A full report covering what you did, the visualizations you created to find insights, why you chose those particular visualizations, and what insights you were able to find.

I took this class last session. When I finished my mockup, I explained what I wanted to show in my final project and asked a couple of friends for their input. You basically have to explain what feedback they offered or what they would want to see in the final visualization. Then you try to work their suggestions into your visualization and explain how that affected your approach to making it.

---

## Visualization Project Part 1: Finding your Data

Although optional, participation in these weekly discussion activities is strongly recommended. Throughout this course you will learn techniques to comprehend clients’ data needs, craft visualization solutions, and evaluate your solutions. Discussion board activities allow you to practice and reflect on these concepts within the context of a discipline or problem area relevant to you. 

For those taking this course as part of the Masters in Data Science, the activities are essential building blocks for the final project. By engaging in these activities/discussions on a weekly basis, you will avoid having to prepare everything for the final project all at once. 

Locate a dataset that you are interested in working with. The data should be sufficiently complex that you can ask lots of questions about it and engage in creative design techniques, but not so complex that you need specialized hardware or algorithmic approaches to analyze it. While you are welcome to use any data you’d like, I recommend that your dataset be tabular (e.g., CSV, TSV, or SQL), contain 5,000 or fewer datapoints (on the order of one hundred or so tends to be sufficiently interesting without causing lag in Altair), and be data that you’re comfortable discussing as part of the course (e.g., avoid data that is overly private or classified). 

Discuss your dataset, including the data’s source, key attributes/dimensions of the data, and your goals for working with that data (i.e., what are the key questions you want to answer). Identify existing relevant visualizations for working with that data (either using the same data, showing the same concepts, or just that might provide some inspiration) and critique those visualizations based on the practices from this module. What works well? What might need improvement or to change to answer your target questions? 

## Answer:

I will be using the Titanic dataset, which contains information about passengers aboard the Titanic, including whether they survived and attributes such as age, gender, and ticket class.

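Data Source: The Titanic dataset is widely available and can be obtained directly from the seaborn library's data repository (https://github.com/mwaskom/seaborn-data/blob/master/titanic.csv). Note that the seaborn copy uses lowercase column names and omits PassengerId, Name, Ticket, and Cabin; the attribute names listed below follow the Kaggle version of the dataset.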

---

Key Attributes/Dimensions of the Data:

- PassengerId: A unique identifier for each passenger.
- Survived: Whether the passenger survived or not (0 = No, 1 = Yes).
- Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
- Name: Passenger's name.
- Sex: Passenger's gender.
- Age: Passenger's age.
- SibSp: Number of siblings/spouses aboard the Titanic.
- Parch: Number of parents/children aboard the Titanic.
- Ticket: Ticket number.
- Fare: Passenger fare.
- Cabin: Cabin number.
- Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).

Goals for working with the data:

- Explore the demographics of passengers aboard the Titanic.
- Analyze factors affecting survival rates, such as gender, age, ticket class, etc.
- Visualize relationships between different attributes to identify patterns and insights.

Existing Relevant Visualizations:

- Survival Rate by Gender: Bar chart showing the count of survivors by gender.
- Survival Rate by Ticket Class: Bar chart showing the count of survivors by ticket class.
- Age Distribution: Histogram showing the distribution of passenger ages.
- Survival Rate by Age Group: Bar chart or box plot showing survival rates by age group.
- Survival Rate by Embarkation Port: Pie chart or bar chart showing survival rates by embarkation port.

Critique of Existing Visualizations:

- Survival Rate by Gender: This visualization effectively shows the difference in survival rates between genders, but it could be enhanced by adding percentages or relative proportions to better compare survival rates.
- Survival Rate by Ticket Class: Similar to the gender visualization, this one could benefit from adding percentages or relative proportions to the bars for better comparison.
- Age Distribution: This histogram provides a good overview of the age distribution of passengers, but it could be improved by adding labels and a title to make it more informative.
- Survival Rate by Age Group: A bar chart or box plot showing survival rates by age group could provide a clearer understanding of how age correlates with survival. However, it's important to ensure that age groups are properly defined and labeled.
- Survival Rate by Embarkation Port: While a pie chart or bar chart can show survival rates by embarkation port, it's essential to consider whether this information is relevant to our goals and whether there are better ways to visualize the data.

Overall, while these visualizations provide valuable insights into different aspects of the Titanic dataset, there is room for improvement in terms of clarity, labeling, and contextual relevance to our analysis goals. We'll aim to create visualizations that address these aspects as we explore the dataset further.
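The suggestion above to show percentages rather than raw counts can be sketched as follows. This is a minimal sketch, not the project's implementation: the rows are made-up illustrative records (not real Titanic data), the field names `sex` and `survived` assume the lowercase naming of the seaborn copy, and `survival_rate_by` and `rate_bar_chart` are hypothetical helpers. Altair is imported only inside the chart function, so the aggregation runs on its own.

```python
from collections import defaultdict

# Illustrative rows only (NOT real Titanic records); the field names
# "sex" and "survived" are assumed to match the loaded dataset.
passengers = [
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 1},
    {"sex": "female", "survived": 0},
    {"sex": "male", "survived": 1},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 0},
    {"sex": "male", "survived": 0},
]


def survival_rate_by(rows, key):
    """Return one summary row per group: size, survivors, and survival rate."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += row["survived"]
    return [
        {
            "group": group,
            "n": totals[group],
            "survivors": survivors[group],
            "rate": survivors[group] / totals[group],
        }
        for group in sorted(totals)
    ]


def rate_bar_chart(summary, title):
    """Build an Altair bar chart of rates with a percent-formatted y axis."""
    import altair as alt  # lazy import: the aggregation above works without Altair

    return (
        alt.Chart(alt.Data(values=summary))
        .mark_bar()
        .encode(
            x=alt.X("group:N", title=None),
            y=alt.Y("rate:Q", title="Survival rate", axis=alt.Axis(format="%")),
            tooltip=["group:N", "n:Q", "survivors:Q"],
        )
        .properties(title=title)
    )


summary = survival_rate_by(passengers, "sex")
```

Calling `rate_bar_chart(summary, "Survival rate by gender")` would return a chart whose bars are directly comparable across groups of different sizes, and the tooltip keeps the raw counts available. The same helper could be reused for ticket class by passing a different key.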


---

## Visualization Project Part 2: Sketching your Data

Your Module 1 discussion post identified some high-level goals for working with a dataset of interest to you. In this post, you will expand on those goals to characterize your target problem and develop some low-fidelity prototypes for working with that data. First, identify two to three tasks you would wish to complete with your data, identifying: 

- Why is a task pursued? (goal)
- How is a task conducted? (means)
- What does a task seek to learn about the data? (characteristics)
- Where does the task operate? (target data)
- When is the task performed? (workflow)
- Who is executing the task? (roles)

Then, sketch a set of preliminary low-fidelity prototypes for addressing these tasks with the given data. You may either sketch freeform or use the Five Design Sheets approach to generate these prototypes (hand-sketched on paper is fine). Upload a copy of your sketches as part of your post. 

---

## Answer:

**Task 1: Explore Demographics of Passengers**

- **Goal**: Understand the distribution of passengers by age, gender, and ticket class.
- **Means**: Conducted through exploratory data analysis (EDA) using histograms, bar charts, and scatter plots.
- **Characteristics**: Seeks to learn about the composition of passengers aboard the Titanic, including age distribution, gender balance, and distribution across ticket classes.
- **Target Data**: Titanic dataset, focusing on attributes such as Age, Sex, and Pclass (ticket class).
- **Workflow**: Typically performed at the beginning of the analysis to gain an overall understanding of the dataset.
- **Roles**: Data analysts or researchers responsible for understanding the dataset.
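For the age-distribution part of Task 1, one way to define and label age groups explicitly is sketched below. The bin edges and labels are assumptions chosen for illustration, the ages are made-up values rather than real passenger records, and `age_group` is a hypothetical helper. Missing ages are given their own "Unknown" group instead of being silently dropped.

```python
from collections import Counter

# Assumed bin edges: (exclusive upper bound in years, label). Adjust to taste.
AGE_BINS = [(13, "0-12"), (18, "13-17"), (40, "18-39"), (65, "40-64")]


def age_group(age):
    """Map an age in years to a labeled bin; None marks a missing age."""
    if age is None:
        return "Unknown"
    for upper, label in AGE_BINS:
        if age < upper:
            return label
    return "65+"


# Illustrative ages only (NOT real Titanic records), including one missing value.
ages = [2, 15, 22, 38, 40, 71, None, 29]

# Count passengers per labeled group; this is the data behind a labeled histogram.
group_counts = Counter(age_group(a) for a in ages)
```

Keeping the bin definitions in one place makes it easy to label the chart axis consistently and to revisit the grouping if evaluators find it confusing.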

**Task 2: Analyze Factors Affecting Survival Rates**

- **Goal**: Investigate how different factors such as gender, age, and ticket class correlate with survival rates.
- **Means**: Conducted through data visualization techniques such as bar charts, box plots, and heatmaps.
- **Characteristics**: Seeks to learn about the relationships between various attributes and survival rates, identifying factors that may have influenced survival outcomes.
- **Target Data**: Titanic dataset, focusing on attributes such as Survived, Age, Sex, and Pclass.
- **Workflow**: Typically performed after exploring demographics to delve deeper into factors influencing survival.
- **Roles**: Data analysts or researchers responsible for understanding patterns and correlations in the data.

**Sketches**:

1. Sketch for Task 1: Explore Demographics
   - Hand-drawn histogram showing the distribution of ages among passengers.
   - Hand-drawn bar chart showing the count of male and female passengers.
   - Hand-drawn bar chart showing the count of passengers in each ticket class.

2. Sketch for Task 2: Analyze Factors Affecting Survival
   - Hand-drawn bar chart comparing survival rates between male and female passengers.
   - Hand-drawn box plot comparing survival rates across different age groups.
   - Hand-drawn heatmap showing survival rates based on both gender and ticket class.

These sketches provide a basic visual representation of the proposed tasks and how they can be addressed using low-fidelity prototypes. They serve as a starting point for further refinement and development of more detailed visualizations during the data analysis process.
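As a step from the heatmap sketch toward an implementation, the data behind it can be prepared as one row per (gender, ticket class) cell. This is a sketch under stated assumptions: the records are made-up illustrative rows (not real Titanic data), the field names `sex`, `pclass`, and `survived` assume the lowercase naming of the seaborn copy, and `survival_grid` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative rows only (NOT real Titanic records).
passengers = [
    {"sex": "female", "pclass": 1, "survived": 1},
    {"sex": "female", "pclass": 1, "survived": 1},
    {"sex": "female", "pclass": 3, "survived": 1},
    {"sex": "female", "pclass": 3, "survived": 0},
    {"sex": "male", "pclass": 1, "survived": 1},
    {"sex": "male", "pclass": 1, "survived": 0},
    {"sex": "male", "pclass": 3, "survived": 0},
    {"sex": "male", "pclass": 3, "survived": 0},
]


def survival_grid(rows):
    """Return one long-form row per (sex, pclass) cell with its survival rate."""
    cells = defaultdict(lambda: [0, 0])  # (sex, pclass) -> [survivors, total]
    for row in rows:
        cell = cells[(row["sex"], row["pclass"])]
        cell[0] += row["survived"]
        cell[1] += 1
    return [
        {"sex": sex, "pclass": pclass, "n": total, "rate": survivors / total}
        for (sex, pclass), (survivors, total) in sorted(cells.items())
    ]


grid = survival_grid(passengers)
```

Each row in `grid` corresponds to one heatmap cell, so in Altair it could be drawn with `mark_rect`, encoding `pclass` on one axis, `sex` on the other, and `rate` as color. Carrying `n` along lets the chart flag cells that are based on very few passengers.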

---

## Visualization Project Part 3: A Plan for Evaluation

In your previous post, you identified a series of tasks and goals for your visualization as well as some preliminary design ideas. We’ll jump ahead a few steps and start to think about how we might evaluate our design approach. Outline a preliminary evaluation that addresses your core goals with the visualization. Make sure your evaluation discusses: 

- The target question you want to answer
- The people you would recruit to answer that question
- The kinds of measures you would use to answer your data (e.g., insight depth, use cases, accuracy) and what these measures would tell you about the core question
- The approach you will use to answer that question (e.g., a journaling study, a formal experiment, etc.)
- How you would instantiate those methods (i.e., what would your participants do?)
- What criteria would you use to indicate that your visualization was successful

---

## Answer:


- The target question you want to answer
    - What factors influenced the survival rate of passengers on the Titanic?

- The people you would recruit to answer that question
    - Individuals with some familiarity with the Titanic disaster or historical events, interested in exploring data visually (e.g., students, history enthusiasts).
      
- The kinds of measures you would use to answer your data (e.g., insight depth, use cases, accuracy) and what these measures would tell you about the core question
    - Insight depth, use cases, and accuracy. These measures would indicate the depth of understanding gained about factors affecting survival rates, how participants interact with the visualization to explore different aspects of the data, and the accuracy of participants' interpretations compared to actual historical data.
    
- The approach you will use to answer that question (e.g., a journaling study, a formal experiment, etc.)
    - A mixed-methods approach combining qualitative (journaling study) and quantitative (formal experiment) techniques.

- How you would instantiate those methods (i.e., what would your participants do?)
    - Participants would interact with an interactive visualization tool, journaling their thoughts and observations while exploring the data and performing specific tasks designed to assess their understanding and interpretation.

- What criteria would you use to indicate that your visualization was successful
    - Informed interpretations, effective use of visualization, accurate insights, and user satisfaction. These criteria would demonstrate the depth of understanding gained, the usability and functionality of the visualization tool, alignment with historical facts, and participant satisfaction with the experience.
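The accuracy measure in this plan could be scored along the following lines. This is only a sketch of one possible scoring scheme: the answer key, the participant responses, and the 0.8 success threshold are all hypothetical placeholders, and `accuracy` and `evaluate` are illustrative helpers. It covers only the quantitative accuracy measure; insight depth and satisfaction would still need qualitative review of the journals.

```python
# Hypothetical answer key for three multiple-choice task questions.
ANSWER_KEY = {"q1": "b", "q2": "a", "q3": "c"}

# Hypothetical responses from three participants (the minimum the project asks for).
RESPONSES = {
    "P1": {"q1": "b", "q2": "a", "q3": "c"},
    "P2": {"q1": "b", "q2": "c", "q3": "c"},
    "P3": {"q1": "b", "q2": "a", "q3": "c"},
}

SUCCESS_THRESHOLD = 0.8  # assumed cutoff for "accurate insights"; not from the plan


def accuracy(answers, key):
    """Fraction of task questions a participant answered correctly."""
    correct = sum(1 for q, expected in key.items() if answers.get(q) == expected)
    return correct / len(key)


def evaluate(responses, key, threshold):
    """Score every participant and compare the mean accuracy to the threshold."""
    per_participant = {p: accuracy(a, key) for p, a in responses.items()}
    mean_accuracy = sum(per_participant.values()) / len(per_participant)
    return {
        "per_participant": per_participant,
        "mean_accuracy": mean_accuracy,
        "successful": mean_accuracy >= threshold,
    }


result = evaluate(RESPONSES, ANSWER_KEY, SUCCESS_THRESHOLD)
```

Reporting the per-participant scores alongside the mean helps show whether a low average comes from one confusing task or from the visualization as a whole.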

