# Week 1
source : https://labs.cognitiveclass.ai/tools/jupyterlab/lab/tree/labs/DV0101EN/DV0101EN-1-1-1-Introduction-to-Matplotlib-and-Line-Plots-py-v2.0.ipynb

Purpose of data visualization: data understanding and reporting. Just remember: less is more attractive, less is more impactful, and less is more effective.

## Matplotlib

Matplotlib's architecture has three layers:

1. Scripting layer: the appropriate layer for everyday purposes. It is a lighter scripting interface that simplifies common tasks and allows quick and easy generation of graphics and plots.
2. Artist layer: where much of the heavy lifting happens. It is usually the appropriate programming paradigm when writing a web application server, a UI application, or a script to be shared with other developers.
3. Backend layer: implements the drawing surface, the renderer, and the handling of user events, as described below.

### Backend Layer

It has three built-in abstract interface classes:
- FigureCanvas (matplotlib.backend_bases.FigureCanvasBase): encompasses the area onto which the figure is drawn.
- Renderer (matplotlib.backend_bases.RendererBase): knows how to draw on the figure canvas.
- Event (matplotlib.backend_bases.Event): handles user inputs such as keyboard strokes and mouse clicks.
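
As a rough illustration of the three interfaces listed above, the sketch below checks that the Agg backend provides concrete subclasses of them. It assumes a current Matplotlib release, where the abstract classes are named FigureCanvasBase and RendererBase; the variable names are only illustrative.

```python
from matplotlib.backend_bases import Event, FigureCanvasBase, MouseEvent, RendererBase
from matplotlib.backends.backend_agg import FigureCanvasAgg, RendererAgg
from matplotlib.figure import Figure

# The Agg backend supplies concrete implementations of the abstract interfaces.
canvas_is_subclass = issubclass(FigureCanvasAgg, FigureCanvasBase)
renderer_is_subclass = issubclass(RendererAgg, RendererBase)
# Mouse events are one kind of backend event.
mouse_is_event = issubclass(MouseEvent, Event)

# Attaching a canvas to a figure gives the figure a surface to be drawn on.
fig = Figure()
canvas = FigureCanvasAgg(fig)
renderer = canvas.get_renderer()  # the object that knows how to draw on this canvas

print(canvas_is_subclass, renderer_is_subclass, mouse_is_event)
print(type(renderer).__name__)
```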

### Artist Layer

It is composed of one main object, the Artist. The artist knows how to use the renderer to draw on the canvas. Titles, lines, tick labels, and images all correspond to individual artist instances. There are two types of artist instances:
- Primitive: Line2D, Rectangle, Circle, and Text.
- Composite: Axis, Tick, Axes, and Figure.

Each composite artist may contain other composite artists as well as primitive artists.

The top-level Matplotlib object that contains and
manages all of the elements in a given graphic is the figure artist, and the
most important composite artist is the axes because it is where most of the
Matplotlib API plotting methods are defined, including methods to create and
manipulate the ticks, the axis lines, the grid or the plot background.
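
The containment described above can be inspected directly. The following is a minimal sketch, with made-up data and an illustrative title, showing that a figure contains an axes, and that the axes in turn contains both composite artists (its x-axis) and primitive artists (a line, the title text).

```python
from matplotlib.artist import Artist
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure
from matplotlib.lines import Line2D
from matplotlib.text import Text

fig = Figure()                      # top-level composite artist
FigureCanvasAgg(fig)                # give the figure a canvas to draw on
ax = fig.add_subplot(111)           # composite artist holding the plot
(line,) = ax.plot([0, 1, 2], [0, 1, 4])      # primitive artist (Line2D)
title = ax.set_title("illustrative title")   # primitive artist (Text)

# Everything in the picture is an Artist instance.
all_artists = all(isinstance(obj, Artist) for obj in (fig, ax, ax.xaxis, line, title))

# Composite artists contain other artists.
axes_in_figure = ax in fig.get_children()
line_in_axes = line in ax.get_children()
xaxis_in_axes = ax.xaxis in ax.get_children()

print(all_artists, axes_in_figure, line_in_axes, xaxis_in_axes)
```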


    # Putting the artist layer to use
    from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
    from matplotlib.figure import Figure
    import numpy as np

    fig = Figure()
    # Agg stands for Anti-Grain Geometry, a high-performance library that produces attractive images
    canvas = FigureCanvas(fig)  # attach the Agg canvas to the figure

    x = np.random.randn(10000)  # 10,000 samples from a standard normal distribution
    ax = fig.add_subplot(111)  # 111 means 1 row, 1 column, first subplot

    ax.hist(x, 100)  # histogram with 100 bins
    ax.set_title("figure")
    fig.savefig('pic.png')

### Scripting layer
It was developed for scientists who are not professional programmers. It consists mainly of the pyplot module, conventionally imported as plt.

    %matplotlib notebook

With the notebook backend in place, each call to a plt function first checks whether an active figure exists. If one does, the call is applied to that figure; if not, a new figure is created.
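
The active-figure behaviour can be observed outside a notebook as well. The sketch below is only an illustration: it assumes the non-interactive Agg backend so that it runs as a plain script, and the data and title are made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

plt.close("all")                    # start with no figures
figures_before = plt.get_fignums()  # empty list: no active figure yet

plt.plot([1, 2, 3], [2, 4, 8])      # no active figure exists, so pyplot creates one
first_fig = plt.gcf()               # the figure that is now active

plt.title("illustrative title")     # applied to the active figure, no new figure
plt.xlabel("x")
figures_after = plt.get_fignums()
same_figure = plt.gcf() is first_fig

print(figures_before, figures_after, same_figure)
```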

### Line plots

- A line plot displays information as a series of data points, called markers, connected by straight line segments.
- The best use case for a line plot is a continuous dataset that you want to visualize over a period of time.

To draw a line plot:

        df.plot(kind='line')
        plt.show()
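
A minimal runnable sketch of a line plot follows. The yearly counts and the column name are hypothetical, and the Agg backend is assumed so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for scripted use
import pandas as pd

# Hypothetical yearly counts, indexed by year so that time runs along the x-axis.
df = pd.DataFrame(
    {"immigrants": [120, 150, 170, 160, 210]},
    index=[2009, 2010, 2011, 2012, 2013],
)

ax = df.plot(kind="line", marker="o")  # one line per column, markers at each data point
ax.set_xlabel("Year")
ax.set_ylabel("Number of immigrants")
ax.set_title("Hypothetical immigration trend")
```

In an interactive session, calling plt.show() afterwards displays the figure.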

# Week 2
source : https://labs.cognitiveclass.ai/tools/jupyterlab/lab/tree/labs/DV0101EN/DV0101EN-2-2-1-Area-Plots-Histograms-and-Bar-Charts-py-v2.0.ipynb

### Area Plot
An area plot, also known as an area chart or area graph, commonly represents cumulative totals, as numbers or percentages, over time. It is based on the line plot and is commonly used to compare two or more quantities.

        df.plot(kind='area')
        plt.show()
        
### Histogram
A histogram is a way of representing frequency distribution of a variable. To plot the histogram:

        count, bin_edges = np.histogram(df['col'])  # bin edges, used to align the x-ticks with the bins
        df['col'].plot(kind='hist', xticks=bin_edges)
        plt.show()
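
To see what np.histogram returns, here is a small sketch on synthetic data (the values and the seed are arbitrary). With the default of 10 bins it yields 10 counts and 11 bin edges, and those edges are what get passed as xticks.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for scripted use
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
values = pd.Series(rng.normal(size=1000))  # synthetic sample, for illustration only

# Default is 10 equal-width bins between the minimum and the maximum.
count, bin_edges = np.histogram(values)

# Passing the edges as xticks places a tick at every bin boundary.
ax = values.plot(kind="hist", xticks=bin_edges)

print(len(count), len(bin_edges), count.sum())
```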

### Bar Plot
Unlike a histogram, a bar chart is commonly used to compare the values of a variable at a given point in time.

        df['col'].plot(kind='bar')
        plt.show()
        

        

# Week 2 - II
source : https://labs.cognitiveclass.ai/tools/jupyterlab/lab/tree/labs/DV0101EN/DV0101EN-2-3-1-Pie-Charts-Box-Plots-Scatter-Plots-and-Bubble-Plots-py-v2.0.ipynb

### Pie Chart
A pie chart is a circular statistical graphic divided into slices to illustrate numerical proportions. A bar chart is usually a better alternative, because bar lengths are easier to compare accurately than slice angles.

    df['col'].plot(kind='pie')
    plt.show()
    
### Box Plot
A box plot statistically represents the distribution of the given data through five main dimensions: the minimum, the first quartile (25%), the median, the third quartile (75%), and the maximum.

    df.plot(kind='box')
    plt.show()
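
The five dimensions can be computed by hand to check what a box plot is showing. This is a minimal sketch on made-up data using NumPy percentiles; note that Matplotlib's default whiskers stop at 1.5 times the interquartile range and draw points beyond that as outliers, so the whisker ends match the true minimum and maximum only when there are no outliers.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # made-up sample

# Minimum, first quartile, median, third quartile, maximum.
minimum, q1, median, q3, maximum = np.percentile(data, [0, 25, 50, 75, 100])
iqr = q3 - q1  # interquartile range: the height of the box

# For the values 1..9 the summary is 1, 3, 5, 7, 9 and the IQR is 4.
print(minimum, q1, median, q3, maximum, iqr)
```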
    
### Scatter Plot
A scatter plot displays the values of, typically, two variables against each other. Usually one is the dependent variable and the other the independent variable.

    df.plot(kind='scatter', x='col1', y='col2')
    plt.show()

# Week 3
source : https://labs.cognitiveclass.ai/tools/jupyterlab/lab/tree/labs/DV0101EN/DV0101EN-3-5-1-Generating-Maps-in-Python-py-v2.0.ipynb

### Waffle Charts
A waffle chart is an interesting visualization that is normally created to display progress towards goals, for example the daily contribution graph on GitHub or Kaggle. Matplotlib has no built-in function for creating waffle charts.
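
One way to build a waffle chart by hand is to fill a grid with category indices and draw it with matshow. The sketch below is only one possible approach, not a standard API: the helper waffle_grid, the 10 x 10 grid size, and the category values are all hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend for scripted use
import matplotlib.pyplot as plt
import numpy as np

def waffle_grid(values, height=10, width=10):
    """Return a height x width grid of category indices, tiles proportional to values."""
    total_tiles = height * width
    proportions = np.asarray(values, dtype=float) / np.sum(values)
    tiles = np.floor(proportions * total_tiles).astype(int)
    # Hand tiles lost to rounding down to the categories with the largest remainders.
    remainders = proportions * total_tiles - tiles
    missing = int(total_tiles - tiles.sum())
    for idx in np.argsort(-remainders)[:missing]:
        tiles[idx] += 1
    flat = np.repeat(np.arange(len(values)), tiles)
    # Fill the grid column by column.
    return flat.reshape(width, height).T

grid = waffle_grid([50, 30, 20])  # hypothetical category totals

fig, ax = plt.subplots()
ax.matshow(grid, cmap="coolwarm")  # one colored tile per grid cell
ax.set_xticks([])
ax.set_yticks([])
```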

### Word Cloud
A word cloud depicts the importance of different words in a body of text. It works in a simple way: the more often a specific word appears in the source text, the bigger and bolder it appears in the word cloud. Matplotlib has no built-in function for creating word clouds.
 
### Regression Plot : Seaborn
Seaborn is a visualization library built on Matplotlib. It provides a high-level interface for statistical graphics, so visualizations take far less code than in plain Matplotlib. The marker and color can be customized with the marker and color arguments.

    sns.regplot(x='col1', y='col2', data=df)
    plt.show()
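
A regression plot combines a scatter of the data with a fitted straight line. The sketch below is not Seaborn's implementation; it uses NumPy's least-squares fit on made-up points only to show what such a line represents.

```python
import numpy as np

# Made-up observations that roughly follow y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Degree-1 least-squares fit: coefficients are returned highest power first.
slope, intercept = np.polyfit(x, y, deg=1)

# Points on the fitted line, i.e. what would be drawn through the scatter.
y_fit = slope * x + intercept

print(round(slope, 2), round(intercept, 2))
```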
    
### Introduction to folium
Folium is a powerful data visualization library in Python that was built primarily to help people visualize geospatial data. With Folium, you can create a map of any location in the world as long as you know its latitude and longitude values. It is interactive.

        world_map = folium.Map() # create the map
        world_map # show the map
        
        # map of Canada
        world_map = folium.Map(location=[56.130, -106.35], zoom_start=4)
        world_map
        
        # the tiles parameter selects the map style, e.g. Stamen Toner or Stamen Terrain
        
- We can add a marker at a point by creating a feature group.

        center = folium.map.FeatureGroup()  # container for the marker
        center.add_child(folium.CircleMarker([51.02, -103.26], radius=5, color='red', fill_color='red'))
        world_map.add_child(center)
        folium.Marker([51.02, -103.26], popup='center').add_to(world_map)  # pin with a popup label
        world_map
        
### Choropleth Maps
A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed, such as population density or per capita income. The higher the measurement, the darker the color. Creating one requires a GeoJSON file that defines the boundaries of each area. A GeoJSON file is a JSON file that contains values such as the country name, geometry type, coordinates, and other information.

        world_map = folium.Map(zoom_start=3)
        geojson = r'World_countries.json'
        # key_on must be a string; with data bound, fill_color names a ColorBrewer palette
        folium.Choropleth(geo_data=geojson, data=df, columns=['country', 'total'],
                          key_on='feature.properties.name',
                          fill_color='YlOrRd', legend_name='Immigration').add_to(world_map)
        world_map
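
To make the key_on path concrete, the sketch below parses a tiny, hypothetical GeoJSON document and follows feature.properties.name by hand. The square geometries, the totals, and the helper resolve are invented for illustration; this mimics the idea of matching data rows to areas and is not Folium's own code.

```python
import json

# Hypothetical, heavily simplified GeoJSON: two features with placeholder squares.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Canada"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
    {"type": "Feature",
     "properties": {"name": "Mexico"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[2, 0], [3, 0], [3, 1], [2, 1], [2, 0]]]}}
  ]
}
"""
geo = json.loads(geojson_text)

def resolve(feature, key_on):
    """Follow a dotted path such as 'feature.properties.name' inside one feature."""
    value = feature
    for part in key_on.split(".")[1:]:  # skip the leading 'feature'
        value = value[part]
    return value

names = [resolve(f, "feature.properties.name") for f in geo["features"]]

# Hypothetical data column, keyed by the same country names.
totals = {"Canada": 10, "Mexico": 5}
matched = {name: totals.get(name) for name in names}

print(names, matched)
```

The names in the data column have to match the GeoJSON names exactly, otherwise the lookup finds no value for that area.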