A generic model to visualize different datasets with ease


1-D Visualization

Overall Stats of the Dataset

Function: SubsetAttributes(dataf, attributes)


dataf - the dataframe that holds the dataset

attributes - the attributes/features for which you want to print stats

  • Print count, mean, std, min, quartiles, max for the attributes defined
  • Display the table
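
The statistics listed above are the ones pandas reports from `DataFrame.describe()`. The sketch below assumes `SubsetAttributes` is a thin wrapper around that call; the name `subset_attributes_sketch` and the sample data are hypothetical and are not taken from the repository.

```python
import pandas as pd


def subset_attributes_sketch(dataf, attributes):
    """Return count, mean, std, min, quartiles and max for the chosen columns."""
    # describe() on numeric columns yields exactly these eight rows.
    return dataf[attributes].describe()


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "height": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 2.0, 4.0, 4.0],
    "label": ["a", "b", "a", "b"],
})
stats = subset_attributes_sketch(sample, ["height", "width"])
print(stats)
```

Passing only numeric attributes keeps the table to the eight rows named above; non-numeric columns would produce a different summary.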

Histogram - all attributes

Function: Plot_Histograms(dataf)


dataf - the dataframe that holds the dataset

  • Plot histograms for all attributes in the dataset

Histogram Plot for a single (numeric) attribute

Function: Plot_Single_Attribute_Hist_Density_Plot(dataf, attribute)


dataf - the dataframe that holds the dataset

attribute - the 'numeric' attribute for which you want to plot the histogram and density plot

  • Plot histogram and density plot for a single attribute
  • For numeric attribute
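
One way to overlay a density curve on a histogram is sketched below. It is not the repository's implementation: the function name, the `bins` parameter and the sample column are hypothetical, and the density is a hand-rolled Gaussian kernel estimate with a normal-reference rule-of-thumb bandwidth, chosen so that the example needs only NumPy and Matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def hist_density_sketch(dataf, attribute, bins=20):
    """Plot a normalised histogram with a Gaussian kernel density curve on top."""
    values = dataf[attribute].dropna().to_numpy(dtype=float)
    # Rule-of-thumb bandwidth (an assumption of this sketch, not of the repo).
    bandwidth = 1.06 * values.std(ddof=1) * len(values) ** (-1 / 5)
    grid = np.linspace(values.min() - 3 * bandwidth,
                       values.max() + 3 * bandwidth, 200)
    # Kernel density estimate: the average of normal bumps centred on each value.
    z = (grid[:, None] - values[None, :]) / bandwidth
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / (
        len(values) * bandwidth * np.sqrt(2 * np.pi))

    fig, ax = plt.subplots()
    # density=True puts the histogram on the same scale as the curve.
    ax.hist(values, bins=bins, density=True, alpha=0.5)
    ax.plot(grid, density)
    ax.set_xlabel(attribute)
    ax.set_ylabel("density")
    return fig, grid, density


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(0)
sample = pd.DataFrame({"measurement": rng.normal(10.0, 1.5, size=200)})
fig, grid, density = hist_density_sketch(sample, "measurement")
```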

Histogram Plot for a single (categorical) attribute

Function: Plot_Single_Categorical_Attribute(dataf, attribute)


dataf - the dataframe that holds the dataset

attribute - the 'categorical' attribute for which you want to plot the frequency of each category

  • Plot a bar plot of category frequencies for a single attribute
  • For categorical attribute
  • Visualize Categorical Data - Visualizing a discrete, categorical data attribute is slightly different and bar plots are good for this
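
A bar plot of category frequencies can be sketched as follows. The function name and the sample data are hypothetical; the sketch only assumes that the categorical attribute is counted with `value_counts()` and drawn with Matplotlib bars.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def categorical_bar_sketch(dataf, attribute):
    """Draw one bar per category, with height equal to its frequency."""
    counts = dataf[attribute].value_counts().sort_index()
    fig, ax = plt.subplots()
    ax.bar([str(category) for category in counts.index], counts.to_numpy())
    ax.set_xlabel(attribute)
    ax.set_ylabel("frequency")
    return ax, counts


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({"grade": ["low", "high", "low", "medium", "low"]})
ax, counts = categorical_bar_sketch(sample, "grade")
```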

2-D Visualization

Correlation Matrix - Heat Map

Function: Correlation_Matrix_HeatMap(dataf)


dataf - the dataframe that holds the entire dataset

  • Plotting Correlation between different attributes
  • Correlation Matrix Heatmap - check out potential relationships or correlations amongst the different data attributes by leveraging a pair-wise correlation matrix and depicting it as a heatmap
  • The gradients in the heatmap vary based on the strength of the correlation
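
The pair-wise correlation matrix and its heatmap can be sketched with pandas and Matplotlib alone. The function name and sample data below are hypothetical, and the repository may well use a different plotting library; the sketch only illustrates the idea of colouring each cell by correlation strength.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def correlation_heatmap_sketch(dataf):
    """Compute Pearson correlations between numeric columns and draw them."""
    corr = dataf.select_dtypes(include="number").corr()
    fig, ax = plt.subplots()
    # Fix the colour range to [-1, 1] so the gradient maps to correlation strength.
    image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.columns)))
    ax.set_yticklabels(corr.columns)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    return ax, corr


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # rises with a
    "c": [4.0, 3.0, 2.0, 1.0],   # falls as a rises
    "name": ["w", "x", "y", "z"],  # non-numeric, ignored
})
ax, corr = correlation_heatmap_sketch(sample)
```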

Pair-Wise Scatter Plot

Function: Pair_Wise_Scatter_Plot(dataf, attribute_list)


dataf - the dataframe that holds the entire dataset

attribute_list - the list of attributes for which you want to plot pair-wise scatter plots

  • Visualizing two-dimensional data with pair-wise scatter plots
  • Pair-wise Scatter Plots - Depicting correlation amongst different attributes of the data
  • Observe patterns in two-dimensions for data attributes
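
A grid of pair-wise scatter plots is available directly from pandas through `pandas.plotting.scatter_matrix`. The wrapper below is a hypothetical sketch, not the repository's code, and the sample columns are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix


def pair_wise_scatter_sketch(dataf, attribute_list):
    """Draw every attribute against every other; histograms on the diagonal."""
    return scatter_matrix(dataf[attribute_list], diagonal="hist", figsize=(6, 6))


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(1)
sample = pd.DataFrame(rng.normal(size=(50, 3)), columns=["x1", "x2", "x3"])
axes = pair_wise_scatter_sketch(sample, ["x1", "x2", "x3"])
```

The returned array holds one Axes per attribute pair, so a list of n attributes gives an n-by-n grid.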

Parallel Coordinates

Function: Parallel_Coordinates(dataf, attribute_list, category_types)


dataf - the dataframe that holds the dataset

attribute_list - the list of attributes for which you want to plot parallel coordinates

  • Points are represented as connected line segments
  • Each vertical line represents one data attribute
  • One complete set of connected line segments across all the attributes represents one data point
  • Points that tend to cluster will appear closer together
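
The description above matches what `pandas.plotting.parallel_coordinates` draws. The sketch below is an assumption about how such a plot could be produced: the `class_column` parameter is hypothetical (the README does not describe `category_types`), as are the function name and sample data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates


def parallel_coordinates_sketch(dataf, attribute_list, class_column):
    """Draw one connected line per row, coloured by the class column."""
    fig, ax = plt.subplots()
    columns = list(attribute_list) + [class_column]
    parallel_coordinates(dataf[columns], class_column, ax=ax)
    return ax


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "x1": [1.0, 1.2, 3.0, 3.1],
    "x2": [0.5, 0.6, 2.0, 2.2],
    "x3": [4.0, 4.1, 1.0, 0.9],
    "kind": ["a", "a", "b", "b"],
})
ax = parallel_coordinates_sketch(sample, ["x1", "x2", "x3"], "kind")
```

Because the attributes share one vertical scale in this plot, attributes with very different ranges are usually rescaled first.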

Scatter Joint Plot

Function: Plot_Scatter_Joint(dataf, attributes)


dataf - the dataframe that holds the dataset

attributes - the list of the two attributes for which you want to plot 'scatter plot' and 'joint plot'

  • Check for patterns, relationships between the two attributes
  • See the individual distributions for the attributes

Categorical Bar Plot - 2 Categories

Function: TwoD_Categorical_Bar_Plots(dataf_class1, dataf_class2, attribute, class_1_label, class_2_label)


dataf_class1 - dataframe with all attributes for class 1

dataf_class2 - dataframe with all attributes for class 2

attribute - attribute that we want to compare for both classes

class_1_label - title/label for class 1 bar plot

class_2_label - title/label for class 2 bar plot

  • Separate plots (subplots) or facets for one of the categorical dimensions
  • Using subplots or facets along with Bar Plots
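
Faceting by class can be sketched as one bar-plot subplot per dataframe with a shared y-axis. The function name and sample data are hypothetical; only the parameter list follows the signature given above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def two_class_bar_sketch(dataf_class1, dataf_class2, attribute,
                         class_1_label, class_2_label):
    """Draw the frequency of `attribute` in two side-by-side facets."""
    fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
    frames = (dataf_class1, dataf_class2)
    labels = (class_1_label, class_2_label)
    for ax, frame, label in zip(axes, frames, labels):
        counts = frame[attribute].value_counts().sort_index()
        ax.bar([str(value) for value in counts.index], counts.to_numpy())
        ax.set_title(label)
        ax.set_xlabel(attribute)
    axes[0].set_ylabel("frequency")
    return axes


# Hypothetical sample data, for illustration only.
class_1 = pd.DataFrame({"score": [5, 5, 6]})
class_2 = pd.DataFrame({"score": [6, 6, 7, 7]})
axes = two_class_bar_sketch(class_1, class_2, "score", "Class 1", "Class 2")
```

Sharing the y-axis keeps the bar heights comparable across the two facets.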

Categorical Bar Plot - 3 Categories

Function: Categorical_Bar_Plots_3Class_2D(dataf_class1, dataf_class2, dataf_class3, attribute, class_1_label, class_2_label, class_3_label)


dataf_class1 - dataframe with all attributes for class 1

dataf_class2 - dataframe with all attributes for class 2

dataf_class3 - dataframe with all attributes for class 3

attribute - attribute that we want to compare for all three classes

class_1_label - title/label for class 1 bar plot

class_2_label - title/label for class 2 bar plot

class_3_label - title/label for class 3 bar plot

  • Separate plots (subplots) or facets for one of the categorical attributes
  • Using subplots or facets along with Bar Plots

Histogram and Density Plots for mixed (numeric+categorical) attributes - 2 classes

Function: Mixed_Attributes_Hist_Density_Plot(dataf_class1, dataf_class2, attribute, class_1_label, class_2_label)


dataf_class1 - dataframe with all attributes for class 1

dataf_class2 - dataframe with all attributes for class 2

attribute - attribute that we want to compare for both classes

class_1_label - title/label for class 1 plot

class_2_label - title/label for class 2 plot

  • Facets with histograms
  • Visualizing mixed attributes in two-dimensions (essentially numeric and categorical together)
  • Faceting/subplots along with generic histograms or density plots
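
Histogram facets for two classes can be sketched as below. The function name, the `bins` parameter and the sample data are hypothetical. The sketch computes one set of bin edges from both classes so the two facets are directly comparable, which is a design choice of this example rather than something the README states.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def mixed_hist_sketch(dataf_class1, dataf_class2, attribute,
                      class_1_label, class_2_label, bins=15):
    """Draw one histogram facet per class, using the same bin edges for both."""
    first = dataf_class1[attribute].dropna().to_numpy(dtype=float)
    second = dataf_class2[attribute].dropna().to_numpy(dtype=float)
    # Shared edges keep the two facets directly comparable.
    edges = np.histogram_bin_edges(np.concatenate([first, second]), bins=bins)
    fig, axes = plt.subplots(1, 2, sharex=True, sharey=True, figsize=(8, 3))
    for ax, values, label in zip(axes, (first, second),
                                 (class_1_label, class_2_label)):
        ax.hist(values, bins=edges)
        ax.set_title(label)
        ax.set_xlabel(attribute)
    axes[0].set_ylabel("frequency")
    return axes, edges


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(2)
class_1 = pd.DataFrame({"measurement": rng.normal(5.0, 1.0, size=100)})
class_2 = pd.DataFrame({"measurement": rng.normal(7.0, 1.5, size=120)})
axes, edges = mixed_hist_sketch(class_1, class_2, "measurement",
                                "Class 1", "Class 2")
```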

Histogram and Density Plots for mixed (numeric+categorical) attributes - 3 classes

Function: Mixed_Attributes_Hist_Density_Plot_3_Classes(dataf_class1, dataf_class2, dataf_class3, attribute, class_1_label, class_2_label, class_3_label)


dataf_class1 - dataframe with all attributes for class 1

dataf_class2 - dataframe with all attributes for class 2

dataf_class3 - dataframe with all attributes for class 3

attribute - attribute that we want to compare for all three classes

class_1_label - title/label for class 1 plot

class_2_label - title/label for class 2 plot

class_3_label - title/label for class 3 plot

  • Facets with histograms
  • Visualizing mixed attributes in two-dimensions (essentially numeric and categorical together)
  • Faceting/subplots along with generic histograms or density plots

Box Plots

Function: Box_Plots(dataf, attribute_x, attribute_y)


dataf - the dataframe holding the entire dataset

attribute_x - the attribute to plot on the x-axis

attribute_y - the attribute on the y-axis

  • Box Plots - an effective representation of two-dimensional mixed attributes
  • Effectively depicting groups of numeric data based on the different values in the categorical attribute
  • A good way to know the quartile values in the data and also potential outliers
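
A grouped box plot, together with the quartile values it depicts, can be sketched as follows. The function name and sample data are hypothetical; the sketch assumes `attribute_x` is categorical and `attribute_y` is numeric.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def box_plots_sketch(dataf, attribute_x, attribute_y):
    """Draw one box per category and return the quartiles behind each box."""
    names, arrays = [], []
    for name, group in dataf.groupby(attribute_x):
        names.append(str(name))
        arrays.append(group[attribute_y].dropna().to_numpy())
    fig, ax = plt.subplots()
    ax.boxplot(arrays)  # boxes are placed at positions 1..n
    ax.set_xticks(range(1, len(names) + 1))
    ax.set_xticklabels(names)
    ax.set_xlabel(attribute_x)
    ax.set_ylabel(attribute_y)
    # The box edges and centre line correspond to these three quantiles.
    quartiles = (dataf.groupby(attribute_x)[attribute_y]
                 .quantile([0.25, 0.5, 0.75]).unstack())
    return ax, quartiles


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "grade": ["low"] * 4 + ["high"] * 4,
    "value": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0],
})
ax, quartiles = box_plots_sketch(sample, "grade", "value")
print(quartiles)
```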

Violin Plots

Function: Violin_Plots(dataf, attribute_x, attribute_y)


dataf - the dataframe holding the entire dataset

attribute_x - the attribute to plot on the x-axis

attribute_y - the attribute on the y-axis

  • Violin Plots - an effective representation of two-dimensional mixed attributes
  • Another effective way to visualize grouped numeric data using kernel density plots
  • Depicts probability density of the data at different values
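
A grouped violin plot can be sketched with Matplotlib's `Axes.violinplot`, which draws a kernel density estimate for each group. The function name and sample data are hypothetical and the repository may use another library.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def violin_plots_sketch(dataf, attribute_x, attribute_y):
    """Draw one violin per category of attribute_x for the numeric attribute_y."""
    names, arrays = [], []
    for name, group in dataf.groupby(attribute_x):
        names.append(str(name))
        arrays.append(group[attribute_y].dropna().to_numpy())
    fig, ax = plt.subplots()
    # The width of each violin follows the estimated density at that value.
    parts = ax.violinplot(arrays, showmedians=True)
    ax.set_xticks(range(1, len(names) + 1))
    ax.set_xticklabels(names)
    ax.set_xlabel(attribute_x)
    ax.set_ylabel(attribute_y)
    return ax, parts


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(3)
sample = pd.DataFrame({
    "grade": ["low"] * 40 + ["high"] * 40,
    "value": np.concatenate([rng.normal(2.0, 0.5, 40), rng.normal(5.0, 1.0, 40)]),
})
ax, parts = violin_plots_sketch(sample, "grade", "value")
```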

3-D Visualization

Pair-Wise Scatter Plots

Function: Pair_Wise_Scatter_Plot_3D(dataf, attribute_list, category_column)


dataf - dataframe that holds the entire dataset

attribute_list - the list of attributes for which you want to plot 'Scatter Plots' in comparison

category_column - the column name in the dataframe, that holds the class labels

  • Scatter Plot with Hue for visualizing data in 3-D
  • Visualizing three-dimensional data with scatter plots and hue (color)
  • Check out correlations and patterns and also compare around class groups
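
Using hue as the third dimension amounts to drawing each class in its own colour on an ordinary scatter plot. The sketch below draws one panel per attribute pair; the function name, layout and sample data are hypothetical and are not taken from the repository.

```python
from itertools import combinations

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def pair_wise_scatter_hue_sketch(dataf, attribute_list, category_column):
    """Scatter every attribute pair, with one colour per class label."""
    pairs = list(combinations(attribute_list, 2))
    fig, axes = plt.subplots(1, len(pairs), figsize=(4 * len(pairs), 4),
                             squeeze=False)
    for ax, (x_name, y_name) in zip(axes[0], pairs):
        # Each scatter call picks the next colour in the cycle, so the
        # class order (and therefore its colour) is the same in every panel.
        for label, group in dataf.groupby(category_column):
            ax.scatter(group[x_name], group[y_name], label=str(label), alpha=0.7)
        ax.set_xlabel(x_name)
        ax.set_ylabel(y_name)
    axes[0][0].legend(title=category_column)
    return axes[0]


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(4)
sample = pd.DataFrame(rng.normal(size=(60, 3)), columns=["x1", "x2", "x3"])
sample["kind"] = ["a"] * 30 + ["b"] * 30
axes = pair_wise_scatter_hue_sketch(sample, ["x1", "x2", "x3"], "kind")
```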

Numeric Data Plot

Function: Numeric_Data_3D(dataf, attribute_x, attribute_y, attribute_z)


dataf - the dataframe that holds the entire dataset

attribute_x - the attribute on the x-axis

attribute_y - the attribute on the y-axis

attribute_z - the attribute on the z-axis

  • Visualizing 3-D numeric data with Scatter Plots
  • length, breadth and depth

Bubble Chart

Function: Bubble_Chart_3D(dataf, attribute_x, attribute_y, attribute_s)


dataf - the dataframe holding the entire dataset

attribute_x - the attribute to plot on the x-axis

attribute_y - the attribute on the y-axis

attribute_s - the attribute represented as size - The values in this attribute must all be positive

  • Visualizing 3-D numeric data with a bubble chart
  • length, breadth and size
  • Leverage the regular 2-D axes & introduce the notion of size as the third dimension (essentially a bubble chart) where the size of the dots indicates the quantity of the third dimension.
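
Mapping the third attribute to marker size can be sketched as below. The function name, the `scale` parameter and the sample data are hypothetical. The positivity check reflects the requirement stated for `attribute_s`; how the repository enforces it, if at all, is not known.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def bubble_chart_sketch(dataf, attribute_x, attribute_y, attribute_s, scale=300.0):
    """Scatter x against y with marker area proportional to attribute_s."""
    sizes = dataf[attribute_s].to_numpy(dtype=float)
    if (sizes <= 0).any():
        raise ValueError("all values of the size attribute must be positive")
    # Rescale so the largest value gets a marker area of `scale` points^2.
    areas = scale * sizes / sizes.max()
    fig, ax = plt.subplots()
    ax.scatter(dataf[attribute_x], dataf[attribute_y], s=areas,
               alpha=0.5, edgecolors="black")
    ax.set_xlabel(attribute_x)
    ax.set_ylabel(attribute_y)
    return ax, areas


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [3.0, 1.0, 2.0],
                       "weight": [1.0, 2.0, 4.0]})
ax, areas = bubble_chart_sketch(sample, "x", "y", "weight")
```

Matplotlib interprets `s` as marker area, so doubling a value doubles the bubble's area rather than its diameter.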

Scatter Plot for mixed (numeric+categorical) attributes

Function: Scatter_Plot_3D_Mix_Data(dataf, attribute_num_x, attribute_num_y, attribute_cat)


dataf - the dataframe that holds the entire dataset

attribute_num_x - the numeric attribute on the x-axis

attribute_num_y - the numeric attribute on the y-axis

attribute_cat - the categorical attribute - you can use the column name in dataframe holding the class labels - this will act as the third dimension

Note: mixed attributes - categorical plus numeric - 2 numeric and 1 categorical

  • Visualizing 3-D mix data using scatter plots
  • leveraging the concepts of hue for categorical dimension
  • Visualizing mixed attributes in three-dimensions leveraging scatter plots and the concept of hue
  • Visualization for three mixed attributes, using the notion of hue for separating our groups in one of the categorical attributes while using conventional visualizations like scatter plots for visualizing two dimensions for numeric attributes

Kernel Density Plot

Function: Kernel_Density_Plot_3D(dataf_class1, dataf_class2, attribute_x, attribute_y)


dataf_class1 - dataframe with all attributes for class 1

dataf_class2 - dataframe with all attributes for class 2

attribute_x - the attribute to place on x-axis

attribute_y - the attribute to place on y-axis

  • Visualizing 3-D mix data using kernel density plots
  • leveraging the concepts of hue for categorical dimension
  • see the density concentrations based on the hue intensity

4-D Visualization

Scatter Plot for mixed (numeric+categorical) attributes

Function: Scatter_Plot_4D_Mix(dataf, attribute_x, attribute_y, attribute_z, category_column)


dataf - the dataframe that holds the dataset

attribute_x - the attribute to place on x-axis; numeric attribute

attribute_y - the attribute to place on y-axis; numeric attribute

attribute_z - the attribute to place on z-axis; numeric attribute

category_column - the column name in the dataframe, that holds the class labels

  • Visualizing 4-D mix data using scatter plots
  • Leveraging the concepts of hue and depth
  • Three numeric attributes and one categorical attribute

Bubble Chart

Function: Bubble_Plot_4D_Mix(dataf, attribute_x, attribute_y, attribute_z, category_column)


dataf - the dataframe that holds the dataset

attribute_x - the attribute to place on x-axis; numeric attribute

attribute_y - the attribute to place on y-axis; numeric attribute

attribute_z - the attribute to represent as size; numeric attribute

category_column - the column name in the dataframe, that holds the class labels

Note: depending on the min and max values of attribute_z, you can alter the value of 'scale' in the code to adjust the min and max bubble sizes shown on the graph

  • Visualizing 4-D mix data using bubble plots
  • Leveraging the concepts of hue and size
  • Visualizing data in four-dimensions leveraging bubble charts and the concept of hue and size
  • Using size to represent one of the numeric attributes
  • Using hue to represent the categorical attribute

5-D Visualization

Bubble Chart

Function: Bubble_Chart_5D_Mix(dataf, attribute_x, attribute_y, attribute_z, attribute_s, category_column)


dataf - the dataframe that holds the dataset

attribute_x - the attribute to place on x-axis; numeric attribute

attribute_y - the attribute to place on y-axis; numeric attribute

attribute_z - the attribute to place on z-axis; numeric attribute

attribute_s - the attribute to represent as size; numeric attribute

category_column - the column name in the dataframe, that holds the class labels

  • Visualizing 5-D mix data using bubble charts
  • leveraging the concepts of hue, size and depth
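
Combining depth, size and hue in one figure can be sketched with a Matplotlib 3-D scatter. The function name, the `scale` parameter and the sample data are hypothetical; only the parameter order follows the signature given above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401  (registers "3d" on older Matplotlib)


def bubble_chart_5d_sketch(dataf, attribute_x, attribute_y, attribute_z,
                           attribute_s, category_column, scale=200.0):
    """3-D scatter: position from x, y, z; area from s; colour from the class."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    largest = float(dataf[attribute_s].max())
    for label, group in dataf.groupby(category_column):
        # One scatter call per class, so each class gets its own colour (hue).
        ax.scatter(group[attribute_x].to_numpy(),
                   group[attribute_y].to_numpy(),
                   group[attribute_z].to_numpy(),
                   s=scale * group[attribute_s].to_numpy() / largest,
                   label=str(label), alpha=0.6)
    ax.set_xlabel(attribute_x)
    ax.set_ylabel(attribute_y)
    ax.set_zlabel(attribute_z)
    ax.legend(title=category_column)
    return ax


# Hypothetical sample data, for illustration only.
rng = np.random.default_rng(5)
sample = pd.DataFrame(rng.uniform(1.0, 10.0, size=(40, 4)),
                      columns=["x", "y", "z", "weight"])
sample["kind"] = ["a"] * 20 + ["b"] * 20
ax = bubble_chart_5d_sketch(sample, "x", "y", "z", "weight", "kind")
```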

