# Interactive Data Visualization
##### (C) 2023-2025 Timothy James Becker: [revision 1.0](),  [GPLv3 license](https://www.gnu.org/licenses/gpl-3.0.html)

In [1]:
%%html
<div id="plot1"></div>
<script type="module"> 
    import * as d3 from "https://cdn.skypack.dev/d3@7";   
    d3.csv("https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/iris.csv")
        .then(function(data) {
            var width = 600
            var height = 300
            var margin = 60 
            // d3.csv returns every field as a string, so coerce to numbers before taking the extent
            var xScale = d3.scaleLinear().range([margin, width - margin]).domain(d3.extent(data, d => +d.Sepal_Length))
            var yScale = d3.scaleLinear().range([height - margin, margin]).domain(d3.extent(data, d => +d.Petal_Length))
            var color = d3.scaleOrdinal()
                .domain(["setosa", "versicolor", "virginica" ])
                .range([ "#F8766D", "#00BA38", "#619CFF"])
            var svg = d3.select("div#plot1").append("svg")
                .attr("width", width)
                .attr("height", height)
 
            svg.selectAll("circle")
                .data(data)
                .join("circle")
                .attr("cx", d => xScale(d.Sepal_Length))
                .attr("cy", d => yScale(d.Petal_Length))
                .attr("r", 5)
                .style("stroke", "darkgrey" )
                .style("stroke-width", 1) 
                .style("fill", function (d) { return color(d.Species) });
            
        })
        .catch(function(error){
            console.log(error)
        })
    
</script>
<!-- For other datasets, look at https://github.com/holtzy/D3-graph-gallery/tree/master/DATA -->
<!-- and change the URL to https://raw.githubusercontent.com/holtzy/D3-graph-gallery/master/DATA/data.xx -->

## <u>Overview</u>

This open educational resource (OER) is intended as the basis for a 14-week undergraduate course in which the students have already completed at least one semester of a high-level programming language such as [Python](https://www.python.org/).  Students do not have to be familiar with the web programming standards [HTML](https://developer.mozilla.org/en-US/docs/Web/HTML), [CSS](https://developer.mozilla.org/en-US/docs/Web/CSS) and [JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript), since we review them at the start.  We present the material as a series of [Jupyter notebooks](https://jupyter.org/) intended to be run by the student using a standard Python 3 kernel attached to a scientific Python distribution such as [Anaconda](https://www.anaconda.com/download/success). These notebooks contain cells that hold either markup (which is compatible with HTML5) or code (in one of the supported programming languages, such as Python). When hosted on a platform like Google Colaboratory (Colab), they allow the user to view the text, click on [hyperlinks](https://en.wikipedia.org/wiki/Hyperlink) and run some of the Python code examples that illustrate the visualization and machine learning methods.  On a platform such as [GitHub](https://docs.github.com/en/pages), the rendered Jupyter notebooks have no running kernel attached and therefore will not execute Python code.  The foundational components for practice are client-side web applications, each consisting of a folder with HTML, CSS, JS and data files inside (in formats such as [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) and [JSON](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON)).  We include appendix sections with details on [setting up a development environment]() and [hosting a finished visualization on github.io]().

We start the course with an overview of visualization and interaction, both of which are intended to facilitate exploration and communication of complex data.  This is followed by a primer on web programming, which details the [DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model), gives some background on styling visual components and finally covers data-centric programming patterns and exercises that are useful for working with data in JavaScript.  Because we are interested in learning about data in this course, we discuss data cleaning (mining and preprocessing) and machine learning (imputation and clustering) methods using the Python programming language (which is more robust and simpler for these tasks).  We spend time discussing and exploring the color and composition elements of perceptual theory, which will allow the student to better conceptualize why certain visual methods better communicate certain types of data to the audience.  This is followed by a section on spatial visualization (maps and geographic information) and some coverage of clustering and dimension reduction methods.

Visualization in this course is the process of taking data (in a simple table or a more complex structure) and constructing a mapping of one or more columns (or fields or variables) to a visualization scale such as a position axis, a color gradient or a geometric parameter (like size or circle radius).
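The mapping from columns to visualization scales can be sketched in plain JavaScript. The two helper functions below are hand-rolled stand-ins written for illustration (they are not the D3 implementations of `d3.scaleLinear` and `d3.scaleOrdinal`), and the three rows are made-up values shaped like the iris table used at the top of this notebook.

```javascript
// Illustrative stand-in for a linear scale: maps a numeric domain onto a pixel range.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Illustrative stand-in for an ordinal scale: maps category names onto colors.
function ordinalScale(domain, range) {
  const lookup = new Map(domain.map((key, i) => [key, range[i % range.length]]));
  return key => lookup.get(key);
}

// Hypothetical rows (not taken from the real dataset).
const rows = [
  { Sepal_Length: 5.0, Species: "setosa" },
  { Sepal_Length: 6.0, Species: "versicolor" },
  { Sepal_Length: 7.0, Species: "virginica" },
];

// One column goes to a position axis, another to a color.
const x = linearScale([5.0, 7.0], [60, 540]);
const color = ordinalScale(
  ["setosa", "versicolor", "virginica"],
  ["#F8766D", "#00BA38", "#619CFF"]
);

// Each row becomes the visual attributes of one mark.
const marks = rows.map(row => ({ cx: x(row.Sepal_Length), fill: color(row.Species) }));
console.log(marks);
```

Each data row turns into one set of visual attributes, which is the same idea the scatterplot cell applies with the D3 scale functions.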



<img src="figures/visualization_overview_figure.png" alt="visualization_overview" width="700px">

## <u>Sections</u>

[Interaction and visualization overview](01_Overview.ipynb) - This overview page (with appendix for setup of IDE...)

[HTML/CSS/JS introduction](02_HTML_CSS_JS.ipynb) - cross device, standard interactions

[JS patterns for working with data](03_JS_Data_Patterns.ipynb) - looping over data, checking results and debugging (JSON)

[Data Loading Patterns](04_Loading_Data.ipynb) - using JS and d3 to manipulate HTML/CSS dynamically, using d3-fetch and filtering.

[Transforming data sources with d3](05_Data_Transforms.ipynb) - using d3-scale to create domain/range maps

[Data Interaction](06_Data_Interaction.ipynb) - examples of common interactions with the mouse and keyboard

[Data cleaning and normalization](07_Data_Cleaning.ipynb) - how to remove problem data and make comparisons

[Imputation and data missingness]() - why does data have missing values and can they be estimated

[Color and Composition]() - what human perceptual elements have to do with visualization

[Space and time part 1, with leaflet]() - spatial visualization is more common than you think

[Space and time part 2, with leaflet]() - spatial information joined to visuals

[Clustering Linear]() - principal component analysis (linear combinations of the variables)

[Clustering non-Linear]() - tSNE and UMAP for non-linear search for structure

[Dimension Reduction]() - reducing dimensionality using UMAP and limits

## <u>Appendix</u>

### <u>A1: Parts of a web app (runs in a browser)</u>

Web applications are interactive programs that make use of [HTML](https://developer.mozilla.org/en-US/docs/Web/HTML), [CSS](https://developer.mozilla.org/en-US/docs/Web/CSS) and [JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript).  The [HTML specification](https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements) allows for either inline CSS and JavaScript or the use of external files, referenced via the [link](https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/link) and [script](https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/script) tags, that are colocated in a web folder.  This web folder constitutes what is commonly known as a client-side web application and can be served using a static webserver such as [NGINX](). The webserver serves the same content to every browser that uses the [GET](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods/GET) request method for its main index.html page, and as the browser's rendering machinery processes this page, the linked CSS and JS files are retrieved and executed in order from top to bottom.

<img src="figures/web_folder.png" alt="web_folder" width="700px">
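To make the role of the static webserver concrete, here is a minimal sketch of how a GET request path could be mapped to a file in the web folder. The function `resolveRequest` and the `CONTENT_TYPES` table are hypothetical names invented for this illustration; this is not how NGINX or any particular webserver is implemented.

```javascript
// Hypothetical lookup from file extension to the content type sent to the browser.
const CONTENT_TYPES = {
  ".html": "text/html",
  ".css": "text/css",
  ".js": "text/javascript",
  ".json": "application/json",
  ".csv": "text/csv",
};

// Hypothetical helper: map a GET request path to a file inside the web folder.
function resolveRequest(urlPath, defaultPage = "index.html") {
  // Drop any query string, then fall back to the default page for folder requests.
  let path = urlPath.split("?")[0];
  if (path.endsWith("/")) path += defaultPage;

  const dot = path.lastIndexOf(".");
  const extension = dot === -1 ? "" : path.slice(dot);

  return {
    file: path.slice(1), // path relative to the web folder
    contentType: CONTENT_TYPES[extension] || "application/octet-stream",
  };
}

console.log(resolveRequest("/"));         // the main page
console.log(resolveRequest("/main.css")); // a linked stylesheet
```

A request for `/` resolves to the default page, and every later request for a linked file resolves to another file in the same folder, which is why the whole application can live in one directory.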

### <u>A2: Setting up a development environment</u>

To set up a proper development environment for data visualization, the user will need at minimum a static webserver, along with a web browser for testing and an editor for updating scripts and content.  The most straightforward low-cost system for students is currently [JetBrains WebStorm](https://www.jetbrains.com/webstorm/), since it includes a dedicated and simple-to-configure webserver that enables data loading with the asynchronous [fetch API](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch) we will use in this OER.
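As a rough sketch of why the webserver matters, the code below loads a CSV file with the fetch API and turns it into an array of row objects. The names `parseCsv` and `loadCsv` are hypothetical helpers written for this example, and the parser is deliberately naive: it assumes a header row, comma separators and no quoted fields. Later sections use d3-fetch for this task instead.

```javascript
// Naive CSV parser: assumes a header row, comma separators and no quoted fields.
function parseCsv(text) {
  const lines = text.trim().split(/\r?\n/);
  const header = lines[0].split(",");
  return lines.slice(1).map(line => {
    const cells = line.split(",");
    return Object.fromEntries(header.map((name, i) => [name, cells[i]]));
  });
}

// Fetch a file served from the web folder and parse it into row objects.
async function loadCsv(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return parseCsv(await response.text());
}

// The parser can be tried without a server by passing it a string directly.
const sample = parseCsv("Sepal_Length,Species\n5.1,setosa\n7.0,versicolor\n");
console.log(sample);
```

In a page served by the webserver, a call such as `loadCsv("data/iris.csv").then(rows => console.log(rows.length)).catch(console.error)` would request the file over HTTP (the path `data/iris.csv` is an assumed location inside the web folder). Browsers typically refuse the same request when the page is opened directly from disk, which is the reason a static webserver is listed as a minimum requirement.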

#### <u>Download and Run the Installer</u>
Navigate to the WebStorm download page linked above and shown below, and select one of the supported operating systems: Windows, Linux or Mac. On a Mac, be sure to pick Apple Silicon if you have a newer system with an M1, M2, M3, M4, etc. processor; pick Intel if it has an Intel i5, i7, etc. Likewise, some newer Windows systems have ARM-based chips and need the Windows ARM download exe. Linux also has downloads for ARM or Intel.  Once the installer finishes without errors, you can start to configure your first project. The images below are from macOS using the Apple Silicon dmg download option.

<img src="figures/webstorm_download.png" alt="webstorm_download" width="600px">

<img src="figures/webstorm_installer.png" alt="webstorm_installer" width="700px">


#### <u>First Project Setup</u>
Using the menu, select new project; the wizard shown below will open. Here you will enter the path in which the web folder (also a project folder in WebStorm) will be created for you.

<img src="figures/webstorm_new_project.png" alt="webstorm_new_project" width="400px">

Next you will create the main index.html file that will be loaded into the web browser when the page is served by the webserver, as shown below:

<img src="figures/webstorm_new_index.png" alt="webstorm_new_index" width="700px">

index.html is a standard starting page that has been in use for a long time, but it is not required if the webserver configuration is changed to point to a different default page.

Finally we will enter some HTML markup and then run the result by clicking on the browser launcher in the right corner of the index.html editor as shown below:

<img src="figures/webstorm_edit_index.png" alt="webstorm_edit_index" width="700px">

This should launch the Firefox browser using a new webserver (on the non-standard port 63342 in this example):

<img src="figures/webstorm_browser_index.png" alt="webstorm_browser_index" width="500px">

Next we will check to make sure the JavaScript is also working by adding the HTML script tag as shown here:

<img src="figures/webstorm_script_src.png" alt="webstorm_script_src" width="700px">

And then add the following code to the main.js file:

<img src="figures/webstorm_main_js.png" alt="webstorm_main_js" width="700px">

This renders the same as our original HTML test page:

<img src="figures/webstorm_browser_index.png" alt="webstorm_browser_index" width="500px">

These tests indicate everything is set up correctly and that the system will work well with the OER materials.

#### Setting up a D3 webapp template

(1) Starting with the test webapp made above a D3 webapp can be made by adding a few lines of code to the index.html head tag which will then look like this:

(2) Next we will add a few lines to main.css to make sure it is loading into the browser session which will make our page a cream-colored off white if it works instead of the normal bright white:

(3) Finally we add some lines of code to the main.js file to ensure the D3 library is loaded and will be available for programming in the main.js file (you will use the d3 variable):
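A minimal sketch of such a check follows. It assumes the D3 library was loaded by a script tag in index.html so that a global `d3` object exists; the function name `describeD3` is hypothetical, and the actual template in the repository may differ.

```javascript
// Report whether a global d3 object is present on the given scope.
function describeD3(scope) {
  if (typeof scope.d3 === "undefined") {
    return "D3 is not loaded";
  }
  return `D3 version ${scope.d3.version} is loaded`;
}

const message = describeD3(globalThis);
console.log(message);

// In the browser, also show the result on the page once the body element exists.
if (typeof document !== "undefined" && document.body) {
  const note = document.createElement("p");
  note.textContent = message;
  document.body.appendChild(note);
}
```

If the message reports that D3 is not loaded, the usual causes are a mistyped library URL or a main.js script tag placed before the tag that loads D3.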

If this works you will have a page that looks like this:

<img src="figures/d3_template_webapp_index.png" alt="d3_template_webapp_index" width="500px">

You can download this template from the github repository under: [d3_template_webapp](https://github.com/timothyjamesbecker/Interactive_Data_Visualization/tree/main/d3_template_webapp)

### <u> A3: Hosting a finished visualization on github.io</u>

A free GitHub account is required to make use of github.io hosting of client-side web applications. Navigating to github.com and using the sign up button will initiate that process:

<img src="figures/github_main.png" alt="github_main" width="700px">

If you use Google for email service, you can also use that account to authenticate:

<img src="figures/github_signup.png" alt="github_signup" width="700px">

Then you typically have to answer a security challenge in the email that github.com will send you.  More detailed instructions are available on the main site [here](https://docs.github.com/en/get-started/start-your-journey/creating-an-account-on-github).

It is then recommended to become comfortable creating a new repository, cloning it, and then committing and pushing new material to it. WebStorm has GitHub integration and can perform those operations once configured. Alternatively, [GitHub Desktop](https://desktop.github.com/download/) can be useful.  The following tutorial can be helpful for these tasks: [new project guide](https://docs.github.com/en/get-started/start-your-journey/hello-world).