An analysis of the performance of Pymaceuticals' drug of interest, with charts and a summary statistics table of the mean, median, variance, standard deviation, and SEM of the tumor volume for each drug regimen.

manishalal145/MatplotlibChallenege

Matplotlib Homework - The Power of Plots

Background

This repository applies Python's Matplotlib library to visualize real-world pharmaceutical data. The data is sourced from Pymaceuticals Inc., a burgeoning pharmaceutical company based out of San Diego. Pymaceuticals specializes in anti-cancer pharmaceuticals. In its most recent efforts, it began screening for potential treatments for squamous cell carcinoma (SCC), a commonly occurring form of skin cancer.

This analysis used the complete data from the company's most recent animal study, supplied as two datasets in CSV format. The first, Mouse_metadata.csv, describes 249 mice with SCC tumor growth that were treated with a variety of drug regimens, and records each mouse's Sex, Age_months, and Weight (g). The second, Study_results.csv, holds the results of the study in the columns Mouse ID, Timepoint, Tumor Volume (mm3), and Metastatic Sites.

Purpose

The purpose of this study was to compare the performance of Pymaceuticals' drug of interest, Capomulin, against the other treatment regimens. The analysis also generated all of the tables and figures needed for the technical report and the top-level summary of the study. Both datasets were imported, merged, and cleaned; the aggregated data was displayed in Python Pandas DataFrames and visualized in Matplotlib, and other libraries were used for the statistical analysis. The project was conducted in a Jupyter notebook to showcase and communicate the analysis.

Table of Contents

Data Cleaning
Summary statistics
Bar and Pie Charts
Quartiles, Outliers and Boxplots
Line and Scatter Plots
Correlation and Regression

Solutions

Data Cleaning

  • The two datasets were loaded and combined, and duplicate records were removed. The head (top five rows) of the cleaned data looks as follows:
    [Image: cleaned data output]
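The cleaning step can be sketched roughly as follows. This is a minimal illustration on tiny synthetic frames, not the notebook's actual code: the "Drug Regimen" column name, the sample values, and the choice to drop every row of a mouse that has a duplicated Mouse ID and Timepoint pair are all assumptions.

```python
import pandas as pd

# Tiny synthetic stand-ins for Mouse_metadata.csv and Study_results.csv
# (values are invented; "Drug Regimen" is an assumed column name).
mouse_metadata = pd.DataFrame({
    "Mouse ID": ["a1", "b2", "c3"],
    "Drug Regimen": ["Capomulin", "Ramicane", "Infubinol"],
    "Sex": ["Female", "Male", "Female"],
    "Age_months": [9, 12, 20],
    "Weight (g)": [22, 25, 30],
})
study_results = pd.DataFrame({
    "Mouse ID": ["a1", "a1", "b2", "b2", "c3", "c3", "c3"],
    "Timepoint": [0, 5, 0, 5, 0, 0, 5],
    "Tumor Volume (mm3)": [45.0, 43.9, 45.0, 44.1, 45.0, 45.0, 47.3],
    "Metastatic Sites": [0, 0, 0, 0, 0, 0, 1],
})

# Combine the two tables on the shared mouse identifier.
combined = pd.merge(study_results, mouse_metadata, on="Mouse ID", how="left")

# Flag every row whose (Mouse ID, Timepoint) pair occurs more than once.
dup_mask = combined.duplicated(subset=["Mouse ID", "Timepoint"], keep=False)
dup_ids = combined.loc[dup_mask, "Mouse ID"].unique()

# Assumed policy: drop all rows belonging to any mouse with duplicated records.
cleaned = combined[~combined["Mouse ID"].isin(dup_ids)].reset_index(drop=True)
print(cleaned.head())
```

If only the repeated rows should go, `combined.drop_duplicates(subset=["Mouse ID", "Timepoint"])` would be the lighter alternative.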

Summary statistics

  • A summary statistics table was generated using two techniques: the first creates multiple series and puts them together at the end, and the second produces everything in a single groupby aggregation. The table consists of the mean, median, variance, standard deviation, and SEM of the tumor volume for each drug regimen.
    [Image: summary statistics output]
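The two techniques might look like the sketch below. The data is synthetic and the "Drug Regimen" column name is an assumption; both approaches should yield the same table.

```python
import pandas as pd

# Synthetic tumor volumes (invented values) for two regimens.
df = pd.DataFrame({
    "Drug Regimen": ["Capomulin"] * 4 + ["Ramicane"] * 4,
    "Tumor Volume (mm3)": [45.0, 43.0, 41.0, 39.0, 45.0, 44.0, 42.0, 41.0],
})
grouped = df.groupby("Drug Regimen")["Tumor Volume (mm3)"]

# Technique 1: compute each statistic as its own Series, then combine them.
summary_series = pd.DataFrame({
    "mean": grouped.mean(),
    "median": grouped.median(),
    "var": grouped.var(),
    "std": grouped.std(),
    "sem": grouped.sem(),
})

# Technique 2: a single groupby aggregation producing every column at once.
summary_agg = grouped.agg(["mean", "median", "var", "std", "sem"])

print(summary_agg)
```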

Bar and Pie Charts

  • Two identical bar charts were generated, one with Pandas's DataFrame.plot() and one with Matplotlib's pyplot, showing the total number of mice for each treatment regimen throughout the course of the study.
    [Image: mice per treatment bar chart (pyplot)]

  • Two identical pie plots were generated, one with Pandas's DataFrame.plot() and one with Matplotlib's pyplot, showing the distribution of female and male mice in the study.
    [Image: gender pie chart (pyplot)]
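A rough sketch of drawing the same bar and pie charts through both interfaces is shown here. The data is synthetic, and two choices are assumptions rather than the notebook's confirmed method: the bar chart counts rows per regimen (use `nunique()` on Mouse ID if unique mice are intended), and the pie chart counts each mouse once.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic records (invented values); column names are assumed.
df = pd.DataFrame({
    "Mouse ID": ["a1", "a1", "c3", "b2", "b2"],
    "Drug Regimen": ["Capomulin", "Capomulin", "Capomulin", "Ramicane", "Ramicane"],
    "Sex": ["Female", "Female", "Male", "Male", "Male"],
})

regimen_counts = df["Drug Regimen"].value_counts()
sex_counts = df.drop_duplicates("Mouse ID")["Sex"].value_counts()

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Bar chart, pandas interface.
regimen_counts.plot(kind="bar", ax=axes[0, 0], title="Bar (pandas)")
# Bar chart, pyplot/Axes interface.
axes[0, 1].bar(regimen_counts.index, regimen_counts.values)
axes[0, 1].set_title("Bar (pyplot)")

# Pie chart, pandas interface.
sex_counts.plot(kind="pie", ax=axes[1, 0], autopct="%1.1f%%", title="Pie (pandas)")
# Pie chart, pyplot/Axes interface.
axes[1, 1].pie(sex_counts.values, labels=sex_counts.index, autopct="%1.1f%%")
axes[1, 1].set_title("Pie (pyplot)")

fig.tight_layout()
```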

Quartiles, Outliers and Boxplots

  • The final tumor volume of each mouse was calculated for four of the most promising treatment regimens: Capomulin, Ramicane, Infubinol, and Ceftamin. The quartiles, IQR, and potential outliers were then determined quantitatively for each of the four regimens.
    [Image: quartiles output]
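One way to carry out this step is sketched below on synthetic data for a single regimen. The helper name `iqr_outliers`, the column names, and the conventional 1.5 × IQR fences are assumptions; the README does not state which outlier rule the notebook uses.

```python
import pandas as pd

# Synthetic study rows: each mouse has a baseline and a final measurement.
finals = {"m1": 38.0, "m2": 39.0, "m3": 40.0, "m4": 41.0, "m5": 60.0}
rows = []
for mouse, vol in finals.items():
    rows.append({"Mouse ID": mouse, "Drug Regimen": "Capomulin",
                 "Timepoint": 0, "Tumor Volume (mm3)": 45.0})
    rows.append({"Mouse ID": mouse, "Drug Regimen": "Capomulin",
                 "Timepoint": 45, "Tumor Volume (mm3)": vol})
df = pd.DataFrame(rows)

# Final tumor volume: find each mouse's last timepoint, then merge back
# to pick up the volume recorded at that timepoint.
last_tp = df.groupby("Mouse ID", as_index=False)["Timepoint"].max()
final = last_tp.merge(df, on=["Mouse ID", "Timepoint"], how="left")


def iqr_outliers(volumes):
    """Return quartiles, IQR, fences, and values outside the 1.5*IQR fences."""
    q1 = volumes.quantile(0.25)
    q3 = volumes.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = volumes[(volumes < lower) | (volumes > upper)]
    return {"q1": q1, "q3": q3, "iqr": iqr,
            "lower": lower, "upper": upper, "outliers": outliers.tolist()}


result = iqr_outliers(final["Tumor Volume (mm3)"])
print(result)
```

For all four regimens, the same helper could be applied per group, for example by looping over `final.groupby("Drug Regimen")["Tumor Volume (mm3)"]`.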

Box and Whisker Plot

  • A box and whisker plot of the final tumor volume for all four treatment regimens was generated, with potential outliers highlighted using color and marker style.
    [Image: drug vs. tumor volume box plot]
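Highlighting outliers in a box plot can be done through Matplotlib's `flierprops` argument, as in this sketch. The volumes are synthetic and the marker styling is an arbitrary choice, not the notebook's actual styling.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter
import matplotlib.pyplot as plt

# Synthetic final tumor volumes (invented values) for the four regimens.
final_volumes = {
    "Capomulin": [38.0, 39.0, 40.0, 41.0, 42.0],
    "Ramicane": [36.0, 37.0, 38.0, 39.0, 40.0],
    "Infubinol": [60.0, 62.0, 64.0, 66.0, 36.0],  # 36.0 falls below the lower fence
    "Ceftamin": [59.0, 61.0, 63.0, 65.0, 67.0],
}
names = list(final_volumes)

fig, ax = plt.subplots()

# flierprops controls how points beyond the whiskers (potential outliers) are drawn.
flier_style = dict(marker="o", markerfacecolor="red",
                   markeredgecolor="black", markersize=8)
bp = ax.boxplot(list(final_volumes.values()), flierprops=flier_style)

ax.set_xticks(range(1, len(names) + 1))
ax.set_xticklabels(names)
ax.set_xlabel("Drug Regimen")
ax.set_ylabel("Final Tumor Volume (mm3)")
```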

Line Plots

  • A line plot of time point versus tumor volume was generated for a selected mouse (r157) treated with Capomulin.
    [Image: Capomulin line plot]
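A minimal sketch of the single-mouse line plot follows. The helper `plot_mouse` is hypothetical, and the measurements attached to mouse r157 here are invented placeholders, not the study's values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic measurements (invented values) for two mice.
df = pd.DataFrame({
    "Mouse ID": ["r157"] * 4 + ["x000"] * 4,
    "Timepoint": [0, 5, 10, 15] * 2,
    "Tumor Volume (mm3)": [45.0, 45.6, 46.1, 46.5, 45.0, 44.0, 43.2, 42.1],
})


def plot_mouse(data, mouse_id, ax):
    """Plot tumor volume against timepoint for one mouse; return its rows."""
    one = data[data["Mouse ID"] == mouse_id].sort_values("Timepoint")
    ax.plot(one["Timepoint"], one["Tumor Volume (mm3)"], marker="o")
    ax.set_title(f"Tumor volume over time, mouse {mouse_id}")
    ax.set_xlabel("Timepoint")
    ax.set_ylabel("Tumor Volume (mm3)")
    return one


fig, ax = plt.subplots()
selected = plot_mouse(df, "r157", ax)
```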

Scatter Plot

  • A scatter plot of mouse weight versus average tumor volume was created for the Capomulin treatment regimen.
    [Image: capomulin regression output]
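The per-mouse averaging behind such a scatter plot could be sketched as below. The data is synthetic, and the column names and the `weight` / `avg_volume` labels are assumptions introduced for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic rows (invented values); one non-Capomulin mouse shows the filter.
df = pd.DataFrame({
    "Mouse ID": ["m1", "m1", "m2", "m2", "m3", "m3", "x1", "x1"],
    "Drug Regimen": ["Capomulin"] * 6 + ["Ramicane"] * 2,
    "Weight (g)": [17, 17, 21, 21, 25, 25, 20, 20],
    "Tumor Volume (mm3)": [45.0, 41.0, 45.0, 43.0, 45.0, 47.0, 45.0, 40.0],
})

# Keep only Capomulin, then reduce to one row per mouse:
# its weight and its mean tumor volume across timepoints.
capomulin = df[df["Drug Regimen"] == "Capomulin"]
per_mouse = capomulin.groupby("Mouse ID").agg(
    weight=("Weight (g)", "first"),
    avg_volume=("Tumor Volume (mm3)", "mean"),
)

fig, ax = plt.subplots()
ax.scatter(per_mouse["weight"], per_mouse["avg_volume"])
ax.set_xlabel("Weight (g)")
ax.set_ylabel("Average Tumor Volume (mm3)")
```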

Correlation and Regression

  • A correlation coefficient was calculated, and a linear regression analysis was conducted, between mouse weight and average tumor volume for the Capomulin treatment. The linear regression model was then plotted on top of the previous scatter plot.

Correlation

The correlation coefficient between mouse weight and average tumor volume is 0.84.

Regression

  • A linear regression line was added to the scatter plot.
    [Image: capomulin regression output]
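A sketch of computing the correlation and overlaying a fitted line is given below. It uses NumPy's `corrcoef` and `polyfit` as one possible approach; the notebook may well use a different library, and the weights and volumes are synthetic, so the printed coefficient will not match the reported 0.84.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; not needed inside Jupyter
import matplotlib.pyplot as plt
import numpy as np

# Synthetic per-mouse values (invented): weight in grams, mean tumor volume in mm3.
weight = np.array([15.0, 17.0, 19.0, 21.0, 23.0])
avg_volume = np.array([36.0, 38.0, 41.0, 42.0, 45.0])

# Pearson correlation coefficient between the two variables.
r = np.corrcoef(weight, avg_volume)[0, 1]

# Least-squares straight line: avg_volume ~ slope * weight + intercept.
slope, intercept = np.polyfit(weight, avg_volume, 1)
fitted = slope * weight + intercept

fig, ax = plt.subplots()
ax.scatter(weight, avg_volume, label="mice")
ax.plot(weight, fitted, color="red",
        label=f"y = {slope:.2f}x + {intercept:.2f}")
ax.set_xlabel("Weight (g)")
ax.set_ylabel("Average Tumor Volume (mm3)")
ax.legend()

print(f"correlation: {r:.2f}")
```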
