# Univariate Analysis

1. Continuous Data Plots:

    * Univariate Scatter Plot (with or without the `hue` parameter)
        * plt.scatter()
        * sns.scatterplot()
        
        ![](https://miro.medium.com/max/372/1*9QNixJ5eRRe8NjAMjEYFjQ.png)
    
    * Line Plot (with markers) - *It is similar to a scatter plot, except that the measurement points are ordered (typically by their x-axis value) and joined with straight line segments.*
        * plt.plot()
        * sns.lineplot()
        
        ![](https://miro.medium.com/max/460/1*3AATuqbjcp3xMiTkuWwNPw.png)
    
    * Strip plot - *The strip plot is similar to a scatter plot. It is often used along with other kinds of plots for better analysis. It is used to visualize the distribution of data points of the variable.* 
        * sns.stripplot(y = df['col1'])
        * sns.stripplot(x = df['col1'] ,y = df['col2'])
        
        ![](https://miro.medium.com/max/332/1*YysiqzshO8EH03-0mQbOYQ.png)
        ![](https://miro.medium.com/max/332/1*OvLxUGVF8HoIB9eN9_rrCw.png)
        
    * Swarm plot - *Like the strip plot, the swarm plot shows the spread of values in a continuous variable. The only difference is that the swarm plot automatically spreads the data points out so that they do not overlap, which gives a better visual overview of the data.*
        * sns.swarmplot(x = df['col1'])
        * sns.swarmplot(x = df['col1'], y = df['col2'])
        
        ![](https://miro.medium.com/max/298/1*LogSEhTZuhuHgytpbP5tfg.png)
        ![](https://miro.medium.com/max/332/1*5NKuFx60PNUFPDKT9CN-GA.png)
        
    * Histogram - *A histogram resembles a bar chart: each bar shows the count or relative frequency of the values that fall in one class interval (bin). It displays the shape and spread of continuous sample data.*
        * plt.hist(df['col'])
        * sns.histplot(df['col'], kde=False, color='black', bins=10)
        
        ![](https://miro.medium.com/max/726/1*XUZLqINB4fJskto9CVNQOw.png)
    
    * Density Plot / KDE Plot - *A density plot is like a smoothed version of a histogram. Generally, a kernel density estimate is used to show the probability density function of the variable.*
        * df['col'].plot(kind = 'density') # pandas plotting; requires SciPy
        * sns.kdeplot(df['col'], fill=True) # `fill` replaces the deprecated `shade` parameter
        
        ![](https://miro.medium.com/max/356/1*inbubvWs2Wi6cizmLEyIvA.png)
        ![](https://miro.medium.com/max/324/1*wcr6xe6VSjqkbF1al9Oi9Q.png)
        
    * Box Plot - *A box plot is a standardized way of displaying the distribution of data based on a five-number summary (minimum, first quartile, second quartile (median), third quartile, maximum). It helps in understanding these parameters of the distribution and is extremely helpful in detecting outliers.*
    
    ![](https://miro.medium.com/max/628/1*FPnhYs6cs3ipUKIZhl9caA.png)
    
        * plt.boxplot(df['col'])
        * sns.boxplot(x = df['col'])
        * sns.boxplot(x = 'variable', y = 'value', data=df) # for multiple boxplots in one plot
        
    ![](https://miro.medium.com/max/298/1*aiGFPKBEHIQsm5VMN9yDHw.png)
    ![](https://miro.medium.com/max/546/1*0x_qAsQ0ZblqfTgXS9cErg.png)
    
    * Distplot - *The seaborn distplot() function combines the matplotlib hist() function with the seaborn kdeplot() and rugplot() functions. It has been deprecated since seaborn 0.11; histplot() and displot() are its replacements.*
        * sns.distplot(df['col'], rug=True) # deprecated
        * sns.displot(x = df['col'], kde=True, rug=True) # current equivalent
        
        ![](https://miro.medium.com/max/380/1*7xqUshvBswGn88mOS9Mdmw.png)
    
    * Violin Plot - *The violin plot is very similar to a box plot, with the addition of a rotated kernel density plot on each side. It shows the distribution of quantitative data across several levels of one (or more) categorical variables so that those distributions can be compared.*
    
    ![](https://miro.medium.com/max/241/1*5TYiYasGcxFLK4ThlgIiXA.png)
    
        * plt.violinplot(df.values, showmedians=True) # all columns of df must be numeric
        * sns.violinplot(y = df['col']) # vertical violin
        * sns.violinplot(x = 'variety', y = 'petal.width', data = df) # for multiple violinplots in one plot
        
    ![](https://miro.medium.com/max/332/1*kgzy9FEnfvjQiFUi8uq7Ew.png)
    ![](https://miro.medium.com/max/555/1*6BefklySIxTT4qfbtT1qTg.png)
        
        <br><br><br><br><br><br>
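
As a rough illustration of the histogram and box plot entries above, the sketch below draws both for a single continuous variable and computes the five-number summary that the box plot is based on. The synthetic `values` array is a hypothetical stand-in for `df['col']`, and the `Agg` backend is assumed only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical continuous sample standing in for df['col']
rng = np.random.default_rng(0)
values = rng.normal(loc=5.0, scale=1.5, size=200)

# Five-number summary: minimum, Q1, median, Q3, maximum
minimum, q1, median, q3, maximum = np.percentile(values, [0, 25, 50, 75, 100])

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: counts of values falling in each of 10 equal-width bins
counts, bin_edges, _ = ax_hist.hist(values, bins=10, color="black")
ax_hist.set_title("Histogram")

# Box plot: box spans Q1..Q3, with a line at the median
box = ax_box.boxplot(values)
ax_box.set_title("Box plot")

fig.tight_layout()
```

Note that matplotlib's default whiskers extend only to the most extreme points within 1.5 × IQR of the quartiles, so they coincide with the minimum and maximum only when there are no outliers. In an interactive session, `plt.show()` displays the figure.
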
2. Categorical Data Plots:
    * Bar Chart/ Count plot - *The bar plot is a univariate data visualization plot on a two-dimensional axis. One axis is the category axis indicating the category, while the second axis is the value axis that shows the numeric value of that category, indicated by the length of the bar.*
        * df['variety'].value_counts().plot.bar()
        * sns.countplot(x = df['variety'])
        
        ![](https://miro.medium.com/max/315/1*3CfkP0AckHPAOjkXoxCYkw.png)
        ![](https://miro.medium.com/max/329/1*jgV7izuVqOW8cn87pjU3UQ.png)
        
    * Pie chart - *A pie chart is a common way to visualize the proportion of the whole occupied by each category.*
        * plt.pie(df['variety'].value_counts(), labels = ['A', 'B', 'C']) # labels must follow the order returned by value_counts()
        * plt.pie(df['variety'].value_counts(), labels = ['A', 'B', 'C'], autopct='%.3f') # to show the percentage in each section of the pie chart
        
        ![](https://miro.medium.com/max/341/1*9BYYMKkNbWQtYyMe0fO2DA.png)
        ![](https://miro.medium.com/max/316/1*FnySNKmCFf-SljzRVmI5HQ.png)
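
The bar chart and pie chart calls above can be combined into one small sketch. The tiny `df` below is a hypothetical example, not data from these notes. Because `value_counts()` orders categories by descending count, taking the labels from `counts.index` (rather than hard-coding them) keeps each label aligned with its wedge.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical categorical column
df = pd.DataFrame({"variety": ["A", "B", "A", "C", "B", "A"]})

# Frequency of each category, sorted from most to least common
counts = df["variety"].value_counts()

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: one bar per category, bar height = count
counts.plot.bar(ax=ax_bar)
ax_bar.set_ylabel("count")

# Pie chart: wedge size = share of the total; autopct prints the percentage
wedges, texts, autotexts = ax_pie.pie(counts, labels=counts.index, autopct="%.1f%%")

fig.tight_layout()
```

The doubled `%%` in `autopct` prints a literal percent sign after the number.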