# DS4200 Homework 3

Due: Friday Oct 31st @ 11:59 PM EST

### Submission Instructions
See the instructions for each question.

### Tips for success
- Start early
- Make use of Piazza
- Make use of office hours
- Remember to use cells and headings to make the notebook easy to read (if a grader cannot find the answer to a problem, you will receive no points for it)
- Under no circumstances may one student view or share their ungraded homework or quiz with another student [(see also)](http://www.northeastern.edu/osccr/academic-integrity), though you are welcome to **talk about** (not show each other) the problems.

## Part 1: Make your website (30 points)

In this assignment, you will create a personal website using GitHub Pages. The website will introduce you or showcase something you’re passionate about—whether it's a hobby, academic interest, or a project. This assignment will help you become familiar with GitHub Pages, HTML, CSS, and how to deploy your website online.

Requirements:

+ Website Structure (15 points): Your website must consist of at least two pages. You are free to structure the site based on the content you wish to highlight, but the following sections are required:
    + Home Page (index.html) (5 points):
        + A clear heading (e.g., "Welcome to My Website" or "About Me").
        + A short introduction about yourself or the topic you are interested in. (At least 200 words)
        + At least one image (e.g., your photo or something that represents your interests).
    + A second page (or more) that expands on the details. This could include (5 points):
        + Descriptive text explaining the content (At least 200 words)
        + At least one image (e.g., your photo or something that represents your interests).
        + A list (ordered or unordered) about the topic
        + A heading smaller than the one on the Home Page
    + Contact Information: Include a contact section (which could be on any of your pages) with links to your email or social media accounts. (2 points)
    + Navigation: Your site should have clear navigation between pages, either via a header menu or links, so visitors can move seamlessly between the Home and Additional Page(s). (3 points)
+ Styling & Customization (10 points): Style your website using custom CSS. You must implement three different customizations beyond basic styling. These could include:
    + Custom fonts or typography.
    + Color scheme changes for backgrounds, text, or links.
    + Custom margins, padding, or layout adjustments.
+ Deployment (5 points): Once your website is complete, ensure it is published on GitHub Pages. Double-check that all links and pages work properly before submitting. 

## Submission

Once you finish all the questions, submit all the files related to the website to Gradescope (you can submit the repository directly to Gradescope). Also submit a Word document or a .txt file that contains:
+ The hyperlink to your website
+ A description of all the Styling & Customization changes you made to the website

Note: if you believe you already have a personal website that meets all the requirements of this assignment, please email the instructor for approval.

## Part 2: D3 basic plots (70 points)

In this part, we provide a CSV file with the Social Media data, which is a subset of a dataset from Kaggle. You need to make three plots by filling in the given template. When you submit the homework, please submit both the .html and .js files.

### Part 2.1 Side-by-side boxplot (30 points)

Using the Social Media data, make a side-by-side boxplot to show the distribution of the number of likes (`Likes`) across the three age groups (`AgeGroup`). To make things easier, you can ignore the outliers for now.

Here is a general approach:

1. In the template, we provide a way to read the CSV file. Once the data is read into D3, all the values are treated as strings, so the first step is to convert the data to a numeric type. The code for this is also provided.
2. Set up the SVG canvas and the scales, add the scales to the canvas as axes, and add labels for the axes. (5 points)
3. To make a boxplot, we need to calculate some basic metrics for the data. For each age group, we need to calculate q1, the median, and q3. We first define a function called `rollupFunction` that computes all the values we need. Follow the example for q1 to set up the median and q3, and any other values you need. (5 points)
4. Add comments for the following two statements (in the .js file) to explain what the code is doing. (5 points)
    
    ```js
    const quantilesByGroups = d3.rollup(data, rollupFunction, d => d.AgeGroup);

    quantilesByGroups.forEach((quantiles, AgeGroup) => {
        const x = xScale(AgeGroup);
        const boxWidth = xScale.bandwidth();
    });
    ```
5. Inside the `.forEach` callback, draw the boxes. There are three things you need to draw for the boxplot:
    - The vertical line in the middle, from q1 - 1.5 * IQR to q3 + 1.5 * IQR. However, in this particular question, since this range is quite large, you can just plot a line from the minimum value to the maximum value for each group. (5 points)
    - The rectangle from q1 to q3. You can fill it with a color (for example, the same color as the background) to hide the part of the vertical line behind it. (5 points)
    - The horizontal line for the median. (5 points)
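
To make steps 3 and 4 more concrete, here is a minimal sketch in plain JavaScript that does not use D3. The helpers `quantile`, `summarize`, and `rollupBy` are hypothetical names written for this illustration, and the sample rows and age-group labels are made up. The `quantile` helper interpolates linearly between the two nearest sorted values; treat it as an approximation of what a library quantile function does, not as the library's implementation.

```javascript
// Made-up rows for illustration only; the real data comes from the CSV file.
const sampleRows = [
  { AgeGroup: "Adult", Likes: 10 },
  { AgeGroup: "Adult", Likes: 50 },
  { AgeGroup: "Adult", Likes: 30 },
  { AgeGroup: "Adult", Likes: 20 },
  { AgeGroup: "Adult", Likes: 40 },
  { AgeGroup: "Teen", Likes: 15 },
  { AgeGroup: "Teen", Likes: 5 },
];

// Quantile of an ascending array, interpolating between neighbouring values.
function quantile(sortedValues, p) {
  const n = sortedValues.length;
  if (n === 0) return undefined;
  const i = (n - 1) * p;
  const lo = Math.floor(i);
  const hi = Math.min(lo + 1, n - 1);
  return sortedValues[lo] + (sortedValues[hi] - sortedValues[lo]) * (i - lo);
}

// Reduce one group of rows to the numbers a boxplot needs.
function summarize(groupRows) {
  const values = groupRows.map(d => d.Likes).sort((a, b) => a - b);
  return {
    q1: quantile(values, 0.25),
    median: quantile(values, 0.5),
    q3: quantile(values, 0.75),
    min: values[0],
    max: values[values.length - 1],
  };
}

// Group rows by a key, then reduce each group: the idea behind a "rollup".
function rollupBy(data, reduce, key) {
  const groups = new Map();
  for (const d of data) {
    const k = key(d);
    if (!groups.has(k)) groups.set(k, []);
    groups.get(k).push(d);
  }
  return new Map([...groups].map(([k, rows]) => [k, reduce(rows)]));
}

// Result: a Map from age group to its summary object.
const summaryByGroup = rollupBy(sampleRows, summarize, d => d.AgeGroup);
summaryByGroup.forEach((summary, group) => console.log(group, summary));
```

Because the result is a `Map`, calling `.forEach` on it visits one age group at a time and passes the summary as the first argument and the group name as the second, which is the order used in the snippet in step 4.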



// Load the data
const socialMedia = d3.csv("socialMedia.csv");

// Once the data is loaded, proceed with plotting
socialMedia.then(function(data) {
    // Convert string values to numbers
    data.forEach(function(d) {
        d.Likes = +d.Likes;
    });

    // Define the dimensions and margins for the SVG
    const margin = { top: 20, right: 30, bottom: 40, left: 60 };
    const width = 600 - margin.left - margin.right;
    const height = 400 - margin.top - margin.bottom;
 
    // Create the SVG container
    const svg = d3.select("#boxplot")
        .append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
        .append("g")
        .attr("transform", `translate(${margin.left},${margin.top})`);    

    // Add scales     
    const xScale = d3.scaleBand()
        .domain([...new Set(data.map(d => d.AgeGroup))])
        .range([0, width])
        .padding(0.2);

    const yScale = d3.scaleLinear()
        .domain([0, d3.max(data, d => d.Likes)])
        .range([height, 0]);

    // Add x-axis/y-axis label & axes
    svg.append("g")
        .attr("transform", `translate(0,${height})`)
        .call(d3.axisBottom(xScale));

    svg.append("g")
        .call(d3.axisLeft(yScale));

    svg.append("text")
        .attr("x", width / 2)
        .attr("y", height + margin.bottom - 5)
        .style("text-anchor", "middle")
        .text("Age Group");

    svg.append("text")
        .attr("transform", "rotate(-90)")
        .attr("x", -height / 2)
        .attr("y", -margin.left + 15)
        .style("text-anchor", "middle")
        .text("Number of Likes");

    
    // Rollup Function
    const rollupFunction = function(groupData) {
        const values = groupData.map(d => d.Likes).sort(d3.ascending);
        const q1 = d3.quantile(values, 0.25);
        const median = d3.quantile(values, 0.5);
        const q3 = d3.quantile(values, 0.75);
        const iqr = q3 - q1;
        const min = d3.min(values);
        const max = d3.max(values);
        return { q1, median, q3, iqr, min, max };
    };

    // This code groups the dataset by 'AgeGroup' and applies 'rollupFunction' to each group,
    // creating a Map where each key is an AgeGroup and each value is the computed result
    const quantilesByGroups = d3.rollup(data, rollupFunction, d => d.AgeGroup);

    // Iterate over each AgeGroup in the Map 'quantilesByGroups', retrieving its quantile values,
    // and compute the x-position and box width for each group from xScale for plotting
    quantilesByGroups.forEach((quantiles, AgeGroup) => {
        const x = xScale(AgeGroup);
        const boxWidth = xScale.bandwidth();

        // Draw vertical lines
        svg.append("line")
            .attr("x1", x + boxWidth / 2)
            .attr("x2", x + boxWidth / 2)
            .attr("y1", yScale(quantiles.min))
            .attr("y2", yScale(quantiles.max))
            .attr("stroke", "black");

        // Draw box
        svg.append("rect")
            .attr("x", x)
            .attr("y", yScale(quantiles.q3))
            .attr("width", boxWidth)
            .attr("height", yScale(quantiles.q1) - yScale(quantiles.q3))
            .attr("fill", "#aad8d3")
            .attr("stroke", "black");

        // Draw median line
        svg.append("line")
            .attr("x1", x)
            .attr("x2", x + boxWidth)
            .attr("y1", yScale(quantiles.median))
            .attr("y2", yScale(quantiles.median))
            .attr("stroke", "red")
            .attr("stroke-width", 2);
    });
});

// Prepare your data and load it again.
// This data should contain three columns: platform, post type, and average number of likes.
const socialMediaAvg = d3.csv("SocialMediaAvg.csv");

socialMediaAvg.then(function(data) {
    // Convert string values to numbers
    data.forEach(d => {
        d.AvgLikes = +d.AvgLikes; 
    });
    
    // Define the dimensions and margins for the SVG
    const margin = { top: 20, right: 150, bottom: 50, left: 60 };
    const width  = 700 - margin.left - margin.right;
    const height = 420 - margin.top - margin.bottom;

    // Create the SVG container
    const svg = d3.select("#barplot")
        .append("svg")
        .attr("width",  width  + margin.left + margin.right)
        .attr("height", height + margin.top  + margin.bottom)
        .append("g")
        .attr("transform", `translate(${margin.left},${margin.top})`);

    // Define four scales
    // Scale x0 is for the platform; it divides the whole x range into one band per platform
    // Scale x1 is for the post type; it divides each x0 band into three parts, one per post type
    // Scale y is for the average likes; leave some extra room at the top for the legend
    // A color scale is also needed for the post type

    const platforms = [...new Set(data.map(d => d.Platform))];
    const x0 = d3.scaleBand()
        .domain(platforms)
        .range([0, width])
        .paddingInner(0.2);
      
    const postTypes = [...new Set(data.map(d => d.PostType))];
    const x1 = d3.scaleBand()
        .domain(postTypes)
        .range([0, x0.bandwidth()])
        .padding(0.1);
      
    const y = d3.scaleLinear()
        .domain([0, d3.max(data, d => d.AvgLikes) * 1.2])
        .nice()
        .range([height, 0]);
      
    const color = d3.scaleOrdinal()
      .domain(postTypes)
      .range(["#1f77b4", "#ff7f0e", "#2ca02c"]);    
  
    // Add scales x0 and y     
    svg.append("g")
        .attr("transform", `translate(0,${height})`)
        .call(d3.axisBottom(x0));

    svg.append("g")
        .call(d3.axisLeft(y));

    // Add x-axis label
    svg.append("text")
        .attr("x", width / 2)
        .attr("y", height + 40)
        .style("text-anchor", "middle")
        .text("Platform");

    // Add y-axis label
    svg.append("text")
        .attr("transform", "rotate(-90)")
        .attr("x", -height / 2)
        .attr("y", -margin.left + 15)
        .style("text-anchor", "middle")
        .text("Average Likes");

    // Group container for bars
    const platformGroups = svg.selectAll(".platform-group")
        .data(platforms)
        .enter()
        .append("g")
        .attr("class", "platform-group")
        .attr("transform", d => `translate(${x0(d)},0)`);

    // Draw bars
    platformGroups.selectAll("rect")
        .data(platform => data.filter(d => d.Platform === platform))
        .enter()
        .append("rect")
        .attr("x", d => x1(d.PostType))
        .attr("y", d => y(d.AvgLikes))
        .attr("width",  x1.bandwidth())
        .attr("height", d => height - y(d.AvgLikes))
        .attr("fill",   d => color(d.PostType))
        .append("title") 
        .text(d => `${d.Platform} - ${d.PostType}: ${d.AvgLikes.toFixed(2)} Likes`);

    // Add the legend
    const legend = svg.append("g")
        .attr("transform", `translate(${width + 20}, 10)`);

    const types = [...new Set(data.map(d => d.PostType))];
 
    types.forEach((type, i) => {
        // Already have the text information for the legend. 
        // Now add a small square/rect bar next to the text with different color.
        legend.append("rect")
            .attr("x", 0)
            .attr("y", i * 22)
            .attr("width", 14)
            .attr("height", 14)
            .attr("fill", color(type));

        legend.append("text")
            .attr("x", 20)
            .attr("y", i * 22 + 11)
            .attr("dominant-baseline", "middle")
            .text(type);
    });
});


// Prepare your data and load it again.
// This data should contain two columns: date (3/1-3/7) and average number of likes.

const socialMediaTime = d3.csv("SocialMediaTime.csv");

socialMediaTime.then(function(data) {
    // Parse the date strings and convert AvgLikes to numbers
    const parseDate = d3.timeParse("%m/%d/%Y (%A)");
    const formatDate = d3.timeFormat("%B %-d"); 

    data.forEach(d => {
        d.Date = parseDate(d.Date);
        d.AvgLikes = +d.AvgLikes;
    });

    // Define the dimensions and margins for the SVG
    const margin = { top: 40, right: 30, bottom: 70, left: 70 };
    const width = 800 - margin.left - margin.right;
    const height = 420 - margin.top - margin.bottom;


    // Create the SVG container
    const svg = d3.select("#lineplot")
        .append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
        .append("g")
        .attr("transform", `translate(${margin.left},${margin.top})`);


    // Set up scales for x and y axes
    const x = d3.scaleTime()
        .domain(d3.extent(data, d => d.Date))
        .range([0, width]);


    const y = d3.scaleLinear()
        .domain([0, d3.max(data, d => d.AvgLikes)])
        .nice()
        .range([height, 0]);  

    // Draw the axis, you can rotate the text in the x-axis here
    const xAxis = d3.axisBottom(x)
        .ticks(data.length)
        .tickFormat(formatDate);


    const yAxis = d3.axisLeft(y).ticks(6);
    
    svg.append("g")
        .attr("transform", `translate(0,${height})`)
        .call(xAxis)
        .selectAll("text")
        .style("text-anchor", "end")
        .attr("transform", "rotate(-25)");
    
    
    svg.append("g").call(yAxis);
    
    
    // Add x-axis label
    svg.append("text")
        .attr("x", width / 2)
        .attr("y", height + 55)
        .attr("text-anchor", "middle")
        .text("Date");
    
    
    // Add y-axis label
    svg.append("text")
        .attr("transform", "rotate(-90)")
        .attr("x", -height / 2)
        .attr("y", -50)
        .attr("text-anchor", "middle")
        .text("Average Likes");
    
    
    // Draw the line and path using curveNatural
    const line = d3.line()
        .x(d => x(d.Date))
        .y(d => y(d.AvgLikes))
        .curve(d3.curveNatural);
    
    
    svg.append("path")
        .datum(data)
        .attr("fill", "none")
        .attr("stroke", "steelblue")
        .attr("stroke-width", 2)
        .attr("d", line);
});




### Part 2.2 Side-by-side bar plot (30 points)

Using the Social Media data, make a side-by-side bar plot to show the relationship between `Platform`, `PostType`, and the average number of `Likes`. Use a different color for each post type. Before you load the data into JS, you need to clean it and produce a summarized version like this:

|   |  Platform | PostType | AvgLikes |
|--:|----------:|---------:|---------:|
| 0 |  Facebook |    Image |   555.89 |
| 1 |  Facebook |     Link |   468.69 |
| 2 |  Facebook |    Video |   505.00 |
| 3 | Instagram |    Image |   502.64 |
| 4 | Instagram |     Link |   459.34 |

Each row is a unique combination of platform and post type with its average number of likes. Two decimal places are enough. Name the dataset `SocialMediaAvg.csv` and keep the column names as in the example above. Note: this step can also be done with D3, but it is more complicated than using Python or another data analysis tool.
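
As one possible way to do this summarization, here is a minimal sketch in plain JavaScript (no D3). The input rows are made up for illustration, and the variable names `posts`, `totals`, and `summary` are assumptions of this sketch rather than part of the template. It accumulates a sum and a count per (`Platform`, `PostType`) pair, then writes one row per pair with the average rounded to two decimals.

```javascript
// Made-up rows standing in for the raw Social Media data.
const posts = [
  { Platform: "Facebook", PostType: "Image", Likes: 500 },
  { Platform: "Facebook", PostType: "Image", Likes: 601 },
  { Platform: "Facebook", PostType: "Video", Likes: 300 },
  { Platform: "Instagram", PostType: "Image", Likes: 450 },
];

// Accumulate the sum and count of likes per (Platform, PostType) pair.
const totals = new Map();
for (const { Platform, PostType, Likes } of posts) {
  const key = `${Platform}|${PostType}`;
  const t = totals.get(key) ?? { Platform, PostType, sum: 0, count: 0 };
  t.sum += Likes;
  t.count += 1;
  totals.set(key, t);
}

// One summary row per pair, with the average rounded to two decimals.
const summary = [...totals.values()].map(t => ({
  Platform: t.Platform,
  PostType: t.PostType,
  AvgLikes: Number((t.sum / t.count).toFixed(2)),
}));

// Serialize as CSV text using the column names from the example table.
const csv = [
  "Platform,PostType,AvgLikes",
  ...summary.map(r => `${r.Platform},${r.PostType},${r.AvgLikes.toFixed(2)}`),
].join("\n");

console.log(csv);
```

The same grouping can be done in Python or a spreadsheet; what matters for the template is that the resulting file has exactly the columns `Platform`, `PostType`, and `AvgLikes`.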

Here is a general approach:

1. Load the data you just made. First convert the strings into numeric data as we did in the previous question, then set up the SVG canvas. (5 points)
2. Set up the scales. You need to define four scales. Scale x0 is for the platform; it divides the whole x range into four parts. Scale x1 is for the post type; it divides each band of the x0 scale into three parts, one per post type. Scale y is numerical, for the average number of likes. The color scale for the post type is provided. (10 points)
3. Add the scales and labels to the plot. (5 points)
4. We group the data for each platform (code provided). Add a rect for each post type, filled with its color. (5 points)
5. Complete the legend. The legend has two parts: the rect and the text. The text for the legend is provided. Now add a small square/rect next to the text with the corresponding color. (5 points)
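
To see how the two x scales in step 2 nest, here is a minimal sketch in plain JavaScript. The `bandScale` helper is a hypothetical, simplified stand-in for a band scale (inner padding only, no outer padding or rounding), not D3's implementation, and the platform list and width are made-up values. The point is only that x1 uses the band width of x0 as its range, so each bar's final position is the platform offset plus the post-type offset.

```javascript
// Simplified band scale: splits [start, stop] into equal bands, one per domain value.
// Assumes the value passed to the scale is present in the domain.
function bandScale(domain, [start, stop], paddingInner = 0) {
  const n = domain.length;
  const step = (stop - start) / Math.max(1, n - paddingInner);
  const bandwidth = step * (1 - paddingInner);
  const scale = value => start + domain.indexOf(value) * step;
  scale.bandwidth = () => bandwidth;
  return scale;
}

const platforms = ["Facebook", "Instagram"]; // hypothetical subset of platforms
const postTypes = ["Image", "Link", "Video"];
const width = 400; // hypothetical inner plot width

// Outer scale: one band per platform across the full width.
const x0 = bandScale(platforms, [0, width], 0.2);
// Inner scale: one band per post type, laid out inside a single platform band.
const x1 = bandScale(postTypes, [0, x0.bandwidth()], 0);

// The left edge of each bar is the platform offset plus the post-type offset.
for (const platform of platforms) {
  for (const postType of postTypes) {
    const left = x0(platform) + x1(postType);
    console.log(platform, postType, left.toFixed(1), x1.bandwidth().toFixed(1));
  }
}
```

In the D3 version, translating a group by `x0(platform)` and then positioning each rect at `x1(postType)` inside that group achieves the same sum of offsets.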

### Part 2.3 Line plot (10 points)

Using the Social Media data, make a line plot to show the relationship between the date and the average number of likes. As in Part 2.2, before you load the data into JS, you need to clean it and produce a summarized version like this:

|   |                 Date |   AvgLikes |
|--:|---------------------:|-----------:|
| 0 |    3/1/2024 (Friday) | 478.504950 |
| 1 |  3/2/2024 (Saturday) | 477.986842 |
| 2 |    3/3/2024 (Sunday) | 542.420455 |
| 3 |    3/4/2024 (Monday) | 523.585106 |
| 4 |   3/5/2024 (Tuesday) | 423.258427 |
| 5 | 3/6/2024 (Wednesday) | 552.942529 |
| 6 |  3/7/2024 (Thursday) | 485.793478 |

Name the dataset `SocialMediaTime.csv` and keep the column names as in the example above. Follow the general approach in the template.

1. Load the data you just made. First convert the strings into numeric data as we did in the previous question. Set up the SVG canvas and the scales, add the scales to the canvas as axes, and add labels for the axes. (5 points)
2. Make the line and path. Remember to use curveNatural. (5 points)

Note: if the tick labels are too long to show, you can include

    .style("text-anchor", "end")
    .attr("transform", "rotate(-25)")

when you add the x-axis, to rotate the x-axis labels.
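
The `Date` values in the table above carry a weekday in parentheses, so they need a parse step before they can be used on a time scale. Here is a minimal sketch of that step in plain JavaScript. The `parseDateLabel` helper is a hypothetical name written for this illustration (in D3 you would normally use a time-parsing format string instead), and the two rows are copied from the example table.

```javascript
// Parse labels such as "3/1/2024 (Friday)"; the weekday in parentheses is ignored.
function parseDateLabel(label) {
  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})/.exec(label.trim());
  if (!match) return null; // label did not start with month/day/year
  const [, month, day, year] = match.map(Number);
  return new Date(year, month - 1, day); // JavaScript months are zero-based
}

// Two rows from the example table, as strings, the way a CSV loader returns them.
const timeRows = [
  { Date: "3/1/2024 (Friday)", AvgLikes: "478.504950" },
  { Date: "3/2/2024 (Saturday)", AvgLikes: "477.986842" },
];

// Convert each row: Date becomes a Date object, AvgLikes becomes a number.
const parsedRows = timeRows.map(d => ({
  Date: parseDateLabel(d.Date),
  AvgLikes: +d.AvgLikes,
}));

console.log(parsedRows);
```

If a label does not match the expected pattern, the helper returns `null`, which makes a malformed row easy to spot before plotting.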

## Submission

Once you finish all the questions, submit the .html and .js files for the D3 part to Gradescope. There is no need to include any data. You can either upload all your code in a folder in the GitHub repository from Part 1 and submit the repository, or zip all the files, using separate folders for the different parts.