# QTM 350 - Data Science Computing
## Quarto Practice - Building & Publishing a Titanic Insights Website
**Author:** Danilo Freire (danilo.freire@emory.edu, Emory University)

# Titanic data using Quarto & GitHub Pages 🚢

# Time to build and publish with Quarto! 📝

## Hands-on Practice

- **Objective:** In this session, you'll create a small Quarto **website** with multiple pages. This website will showcase data analysis and visualisations using the Titanic dataset. You will then publish this website live on the internet using GitHub Pages.
- This practice session focuses on:
  - Creating and structuring a Quarto website project.
  - Developing content for different pages (text, code, visualizations).
  - Configuring website navigation.
  - Using Git for version control.
  - Publishing your work via GitHub Pages.
- **Estimated Time:** 60-75 minutes (publishing can take a few extra minutes for GitHub to build).

- Feel free to use any resources you have available (lecture notes, Quarto documentation). 
- If you get stuck, please ask for help! Searching the web (e.g., for specific Quarto options or Git commands) is also a good idea!

## Part 1: Creating Your Quarto Website Project

1.  **Create a Project Directory (Local Machine):**
    * Open your terminal or command prompt.
    * Create a new directory for this project and navigate into it:
    ```sh
    mkdir titanic_quarto_website
    cd titanic_quarto_website
    ```

2.  **Create a New Quarto Website Project:**
    * Inside the `titanic_quarto_website` directory, run the Quarto command to create a new website project. The trailing `.` tells Quarto to create the project in the current directory instead of a new subfolder.
    ```sh
    quarto create-project --type website .
    ```
    *This will generate a few files and folders, including `_quarto.yml`, `index.qmd`, `about.qmd`, and `styles.css`.* 

3.  **Obtain the Titanic Dataset:**
    * Create a `data` subdirectory within your project for the dataset:
    ```sh
    mkdir data
    ```
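    * Create `data/titanic.csv` using `seaborn`. Run this Python code once from the project root, for example in a temporary script. `sns.load_dataset` downloads the data, so you need an internet connection:
    ```python
    import seaborn as sns

    # Download the Titanic dataset and save it inside the project
    titanic_df = sns.load_dataset('titanic')
    titanic_df.to_csv('data/titanic.csv', index=False)
    print("'data/titanic.csv' created successfully!")
    ```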
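
Before building pages on top of the file, it can help to confirm that the CSV has the columns the later analyses use. The following is a minimal sketch using only the standard library; the helper name `missing_columns` and the demo file are illustrative assumptions, not part of the original exercise.

```python
import csv
import tempfile
from pathlib import Path

# Columns the analysis pages in this practice rely on.
REQUIRED_COLUMNS = {"survived", "pclass", "age", "fare"}


def missing_columns(csv_path):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return sorted(REQUIRED_COLUMNS - set(header))


# Demo on a tiny stand-in file (made-up row, NOT the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = Path(tmp) / "titanic.csv"
    demo_csv.write_text("survived,pclass,age,fare\n0,3,22.0,7.25\n", encoding="utf-8")
    demo_missing = missing_columns(demo_csv)

print(demo_missing)  # an empty list means every required column is present
```

In your own project you would call `missing_columns('data/titanic.csv')` from the project root.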


## Part 2: Configuring Your Website (`_quarto.yml`)

1.  **Edit `_quarto.yml`:**
    * Open the `_quarto.yml` file in your editor.
    * This file controls your website's overall structure and appearance.
    * Modify it to look something like this (you can change the title and theme):
    ```yaml
    project:
      type: website
      output-dir: docs # We'll render the website to a 'docs' folder for GitHub Pages

    website:
      title: "Titanic Data"
      navbar:
        left:
          - href: index.qmd
            text: Home
          - href: pclass_analysis.qmd
            text: Survival by Class
          - href: age_fare_analysis.qmd
            text: Age vs. Fare
          - about.qmd # Keep the about page for now
      page-footer: "Built with Quarto by Your Name"

    format:
      html:
        theme: cosmo # Try other themes: sketchy, darkly, lumen, sandstone, etc.
        css: styles.css
        toc: true
        code-fold: true
    ```
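
The navbar above links to pages that do not exist yet, and a missing page is an easy way to end up with render problems or broken links. As a rough sketch, the snippet below scans `_quarto.yml` for `.qmd` names and reports those that are not in the project folder. It is a naive text scan rather than a real YAML parser, and the function names are hypothetical helpers for illustration.

```python
import re
import tempfile
from pathlib import Path


def referenced_pages(config_text):
    """Collect every '.qmd' file name mentioned in the config, ignoring comments."""
    pages = []
    for line in config_text.splitlines():
        code = line.split("#", 1)[0]  # drop trailing YAML comments
        pages.extend(re.findall(r"[\w./-]+\.qmd", code))
    return pages


def missing_pages(project_dir):
    """Return referenced pages that do not exist in the project directory yet."""
    project_dir = Path(project_dir)
    config_text = (project_dir / "_quarto.yml").read_text(encoding="utf-8")
    return [p for p in referenced_pages(config_text) if not (project_dir / p).exists()]


# Demo in a throwaway folder: two pages exist, one navbar target is missing.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "_quarto.yml").write_text(
        "website:\n"
        "  navbar:\n"
        "    left:\n"
        "      - href: index.qmd\n"
        "      - href: pclass_analysis.qmd\n"
        "      - about.qmd # Keep the about page for now\n",
        encoding="utf-8",
    )
    (root / "index.qmd").touch()
    (root / "about.qmd").touch()
    demo_missing_pages = missing_pages(root)

print(demo_missing_pages)  # ['pclass_analysis.qmd']
```

Running `missing_pages('.')` from the project root after Part 4 should return an empty list.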

## Part 3: Creating Content - Index Page (`index.qmd`)

1.  **Edit `index.qmd`:**
    * Open `index.qmd`. This is your website's homepage.
    * Modify its content. Here's an example:
    ```markdown
    ---
    title: "Titanic Insights"
    ---

    ## Exploring the Titanic Dataset

    This website presents a brief analysis of the passenger data from the RMS Titanic. The analysis includes:

    - Survival rates based on passenger class.
    - The relationship between passenger age and the fare they paid.

    Use the navigation bar above to explore the different analyses.

    The data used is the well-known Titanic dataset, often used for introductory data science tasks.

    ![](https://www.science.smith.edu/climatelit/wp-content/uploads/sites/97/2024/07/GettyImages-517357578-5c4a27edc9e77c0001ccf77d-Large.jpeg)
    *Source: Smith College*
    ```
    *Feel free to customise the text and swap in a different image. Make sure you have the right to use it, or pick a public-domain image.* You can also save an image to your project directory (e.g., in an `images` folder) and link to it like `![](images/titanic.jpg)`.

## Part 4: Creating Content - Analysis Pages

Now, create the two analysis pages that the navbar in `_quarto.yml` links to.

### 4.1. `pclass_analysis.qmd` (Survival by Passenger Class)

1.  **Create `pclass_analysis.qmd`:**
    * In your project root, create a new file named `pclass_analysis.qmd`.
2.  **Add Content:**
    ````markdown
    ---
    title: "Survival Rate by Passenger Class"
    ---

    This page analyzes the survival rate of Titanic passengers based on their socio-economic class (Pclass).

    First, let's load the necessary libraries and the dataset.

    ```{python}
    #| label: setup-libs-data-pclass
    #| echo: true
    #| eval: true
    #| message: false
    #| warning: false

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    titanic_df = pd.read_csv('data/titanic.csv')
    ```

    ### Calculating and Visualising Survival Rates

    ```{python}
    #| label: pclass-survival-plot
    #| echo: true
    #| eval: true
    #| fig-cap: "Survival Rate by Passenger Class"

    # Calculate survival rate by Pclass
    pclass_survival_rate = titanic_df.groupby('pclass')['survived'].mean().reset_index()
    print("Survival rate by Pclass:")
    print(pclass_survival_rate)

    # Create bar plot
    plt.figure(figsize=(8, 5))
    sns.barplot(x='pclass', y='survived', data=pclass_survival_rate, palette='viridis', hue='pclass', dodge=False, legend=False)
    plt.title('Survival Rate by Passenger Class')
    plt.xlabel('Passenger Class')
    plt.ylabel('Survival Rate')
    plt.xticks(ticks=[0,1,2], labels=['1st Class', '2nd Class', '3rd Class'])
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.show()
    ```

    #### Interpretation
    The bar chart shows that passengers in first class had a higher survival rate compared to those in second and third class. Third class passengers had the lowest survival rate.
    ````
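
The `groupby('pclass')['survived'].mean()` call on this page works because `survived` is coded as 0 or 1, so its mean within each class is the share of survivors. A plain-Python sketch of the same calculation is shown here; the rows are made up for illustration and are NOT taken from the real dataset.

```python
from collections import defaultdict

# Made-up (pclass, survived) pairs for illustration; NOT rows from the real dataset.
toy_rows = [(1, 1), (1, 1), (1, 0), (2, 1), (2, 0), (3, 0), (3, 0), (3, 1), (3, 0)]


def survival_rate_by_class(rows):
    """Average the 0/1 'survived' flag within each class."""
    totals = defaultdict(lambda: [0, 0])  # class -> [survivors, passengers]
    for pclass, survived in rows:
        totals[pclass][0] += survived
        totals[pclass][1] += 1
    return {pclass: s / n for pclass, (s, n) in sorted(totals.items())}


toy_rates = survival_rate_by_class(toy_rows)
print(toy_rates)
```

With these toy rows, class 2 has one survivor out of two passengers, so its rate is 0.5, which is exactly what the mean of `[1, 0]` gives.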

### 4.2. `age_fare_analysis.qmd` (Age vs. Fare)

1.  **Create `age_fare_analysis.qmd`:**
    * Create another new file in your project root named `age_fare_analysis.qmd`.
2.  **Add Content:**
    ````markdown
    ---
    title: "Age vs. Fare Analysis"
    ---

    This page explores the relationship between passenger age and the fare they paid.

    ```{python}
    #| label: setup-libs-data-agefare
    #| echo: true
    #| eval: true
    #| message: false
    #| warning: false

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    titanic_df = pd.read_csv('data/titanic.csv')
    ```

    ### Scatter Plot: Age vs. Fare
    For the following scatter plot, passengers with unknown age are excluded, since a point cannot be drawn without an age value.

    ```{python}
    #| label: age-fare-scatter-plot
    #| echo: true
    #| eval: true
    #| fig-cap: "Scatter Plot of Age vs. Fare, Coloured by Survival Status"

    # Create a temporary DataFrame excluding rows with missing 'age'
    titanic_age_fare = titanic_df.dropna(subset=['age'])
    print(f"Number of passengers with known age: {len(titanic_age_fare)}")

    plt.figure(figsize=(10, 6))
    sns.scatterplot(x='age', y='fare', hue='survived', data=titanic_age_fare, alpha=0.7, palette={0: '#377eb8', 1: '#ff7f00'})
    plt.title('Age vs. Fare of Titanic Passengers (Coloured by Survival)')
    plt.xlabel('Age (Years)')
    plt.ylabel('Fare Paid')
    plt.legend(title='Survived (0=No, 1=Yes)')
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.show()
    ```

    #### Interpretation
    The scatter plot shows the distribution of passengers by age and fare. While there isn't a very strong linear relationship, we can observe that some younger passengers paid very high fares.
    ````
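
To see what `dropna(subset=['age'])` does on this page, it may help to look at how missing ages appear in the file itself: pandas writes missing values as empty fields by default, and reads them back as `NaN`. The sketch below mimics the filter with the standard library on made-up CSV text (NOT the real data); `rows_with_known_age` is a hypothetical helper.

```python
import csv
import io

# Made-up CSV text for illustration; an empty 'age' field stands for a missing value.
toy_csv = """age,fare,survived
22.0,7.25,0
,8.05,1
38.0,71.28,1
,13.0,0
"""


def rows_with_known_age(csv_text):
    """Keep only the rows whose 'age' field is not empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["age"].strip() != ""]


known = rows_with_known_age(toy_csv)
print(f"Rows with known age: {len(known)}")  # Rows with known age: 2
```

Only the `age` column is checked, just as `subset=['age']` restricts `dropna` to that column; rows with other missing fields would be kept.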

## Part 5: Rendering and Previewing Your Website

1.  **Render the Entire Website:**
    * Save all your `.qmd` and `_quarto.yml` files.
    * In your terminal (at the root of `titanic_quarto_website`), run:
    ```sh
    quarto render
    ```
    * This command will process all your `.qmd` files and build the website in the `docs` folder (as specified in `_quarto.yml`). Check the terminal output for errors before moving on.

2.  **Preview Locally (Recommended):**
    * You can preview your website before publishing. Use:
    ```sh
    quarto preview
    ```
    * This will usually open the website in your default web browser. Navigate through your pages to ensure everything looks correct.
    * Press `Ctrl+C` in the terminal to stop the preview server.
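
After `quarto render`, each page in the project root should have a matching `.html` file in `docs`. A quick way to check this is sketched below, assuming the `docs` output folder configured earlier; `unrendered_pages` is an illustrative helper, not a Quarto feature.

```python
import tempfile
from pathlib import Path


def unrendered_pages(project_dir, output_dir="docs"):
    """List .qmd files in the project root with no matching .html in the output folder."""
    project_dir = Path(project_dir)
    out = project_dir / output_dir
    return sorted(
        qmd.name
        for qmd in project_dir.glob("*.qmd")
        if not (out / f"{qmd.stem}.html").exists()
    )


# Demo in a throwaway folder: 'about.qmd' has not been rendered yet.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "index.qmd").touch()
    (root / "about.qmd").touch()
    (root / "docs").mkdir()
    (root / "docs" / "index.html").touch()
    demo_unrendered = unrendered_pages(root)

print(demo_unrendered)  # ['about.qmd']
```

In the real project, `unrendered_pages('.')` returning an empty list suggests every page was built.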

## Part 6: Publishing to GitHub Pages

Now, let's get your website online!

1.  **Initialise a Git Repository:**
    * If you haven't already, initialise Git in your project directory:
    ```sh
    git init
    git branch -M main # Ensure your default branch is 'main'
    ```

2.  **Create a `.gitignore` file:**
    * Create a file named `.gitignore` in your project root and add common files/folders to ignore:
    ```gitignore
    # Quarto specific
    /.quarto/
    /_freeze/

    # Python specific
    __pycache__/
    *.py[cod]
    *.egg-info/
    *.so
    venv/
    env/
    .env

    # OS specific
    .DS_Store
    Thumbs.db
    ```

3.  **Commit Your Website Files:**
    * Add all your project files (including the `docs` folder with the rendered website) to Git and make your first commit:
    ```sh
    git add .
    git commit -m "Add Titanic Quarto website"
    ```

4.  **Create a New Repository on GitHub:**
    * Go to [GitHub](https://github.com) and create a **new, empty public repository**.
    * Name it something like `titanic-quarto-insights` or `my-titanic-website`.
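    * **Do not** initialise it with a README, .gitignore, or license on GitHub. Your local repository already has its own commit (and its own `.gitignore`), and a pre-populated remote would cause your first push to be rejected.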
    * Copy the HTTPS or SSH URL for your new repository (e.g., `https://github.com/YOUR_USERNAME/YOUR_REPONAME.git`).

5.  **Connect Local Repo to GitHub and Push:**
    * Back in your terminal, link your local repository to the remote one on GitHub (replace `YOUR_GITHUB_REPO_URL`):
    ```sh
    git remote add origin YOUR_GITHUB_REPO_URL
    ```
    * Push your `main` branch to GitHub:
    ```sh
    git push -u origin main
    ```

6.  **Configure GitHub Pages:**
    * Go to your new repository on GitHub.
    * Click on the "Settings" tab.
    * In the left sidebar, click on "Pages".
    * Under "Build and deployment", for the "Source", select "Deploy from a branch".
    * Under "Branch", select `main` and for the folder, select `/docs`.
    * Click "Save".

7.  **Access Your Live Website:**
    * GitHub Pages will now build and deploy your website. This might take a minute or two.
    * Once deployed, GitHub will provide a URL for your live site (usually in the format `https://YOUR_USERNAME.github.io/YOUR_REPONAME/`).
    * Visit the URL to see your Titanic Insights website live on the internet!
    * *Note: Sometimes it takes a few minutes for the site to become active or for changes to propagate.*
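
The site URL follows from the repository URL you copied in step 4. As a small sketch of that mapping, assuming the usual project-site format quoted above, the hypothetical helper `pages_url` converts an HTTPS or SSH remote URL into the expected Pages address. It does not handle special cases such as custom domains.

```python
import re


def pages_url(remote_url):
    """Guess the GitHub Pages URL for a project site from its GitHub remote URL."""
    match = re.fullmatch(
        r"(?:https://github\.com/|git@github\.com:)([^/]+)/(.+?)(?:\.git)?/?",
        remote_url.strip(),
    )
    if match is None:
        raise ValueError(f"Not a recognised GitHub remote URL: {remote_url!r}")
    user, repo = match.groups()
    return f"https://{user}.github.io/{repo}/"


print(pages_url("https://github.com/YOUR_USERNAME/YOUR_REPONAME.git"))
# https://YOUR_USERNAME.github.io/YOUR_REPONAME/
```

The address shown in the repository's Pages settings remains the authoritative one; this only helps you know what to expect.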

## Part 7: Bonus - Create a Presentation Slide (Optional)

If you have extra time, you can quickly turn one of your analysis pages (e.g., `pclass_analysis.qmd`) into a `revealjs` presentation.

1.  **Modify YAML in `pclass_analysis.qmd` (or a copy, which is recommended):**
    * Add `revealjs` to the format options in the YAML header of `pclass_analysis.qmd`:
    ```yaml
    ---
    title: "Survival Rate by Passenger Class"
    format:
      revealjs:
        slide-level: 3 # H3 headings become slides
    ---
    ```
2.  **Render that specific file to `revealjs`:**
    ```sh
    quarto render pclass_analysis.qmd --to revealjs
    ```
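    * This creates `pclass_analysis.html` as a presentation. Because the file belongs to a website project with `output-dir: docs`, expect the output under `docs/`, where it may replace the website version of that page; working on a copy avoids this. You could then link to the presentation from your main website or upload it separately.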
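
Since working on a copy is recommended, here is one possible way to produce it: a short sketch that writes a copy of the page with the `revealjs` header from step 1 in place of the original front matter. The target name `pclass_slides.qmd` and the helper `make_slides_copy` are assumptions for illustration; copying the file by hand and editing its header works just as well.

```python
import tempfile
from pathlib import Path

# The revealjs front matter from step 1.
SLIDES_HEADER = """---
title: "Survival Rate by Passenger Class"
format:
  revealjs:
    slide-level: 3
---
"""


def make_slides_copy(source, target):
    """Write a copy of `source` whose YAML front matter is replaced by SLIDES_HEADER."""
    text = Path(source).read_text(encoding="utf-8")
    parts = text.split("---\n", 2)  # ['', front matter, body] when a header exists
    body = parts[2] if len(parts) == 3 and parts[0] == "" else text
    Path(target).write_text(SLIDES_HEADER + body, encoding="utf-8")
    return Path(target)


# Demo in a throwaway folder with a stand-in page.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "pclass_analysis.qmd"
    page.write_text(
        '---\ntitle: "Survival Rate by Passenger Class"\n---\n\nBody text\n',
        encoding="utf-8",
    )
    slides = make_slides_copy(page, Path(tmp) / "pclass_slides.qmd")
    demo_slides_text = slides.read_text(encoding="utf-8")
    demo_page_text = page.read_text(encoding="utf-8")

print(demo_slides_text)
```

The original page is left untouched, so the website version keeps rendering as HTML while the copy can be rendered with `quarto render pclass_slides.qmd`.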

## End of Practice Session

- **Congratulations!** 🥳 
- You've created a multi-page Quarto website, performed data analysis, and published it using GitHub Pages.
- If you have any questions, please feel free to ask!

# And that's all for today! 🎉