# Analyzing and Visualizing Penguin Data with ggplot2
### Data Analysis and Visualization using the **palmerpenguins** dataset in R

In this notebook, I apply the data analysis and visualization skills from my **Data Analysis in R** course at **Eastern University**, using **ggplot2** to explore the **palmerpenguins** dataset and visualize the patterns in it.

### Setup and library imports

In [5]:
# install necessary libraries
install.packages(c("ggplot2", "palmerpenguins", "dplyr"))


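Calling `install.packages()` unconditionally downloads the packages again each time the cell runs. As a minimal sketch of one alternative, the cell could install only what is missing. The helper name `missing_packages()` is hypothetical and introduced here for illustration; it relies only on base R's `requireNamespace()`.

```r
# Hypothetical helper (not part of any package): return the subset of
# `pkgs` that cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only the packages that are not already available.
to_install <- missing_packages(c("ggplot2", "palmerpenguins", "dplyr"))
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

With this guard in place, rerunning the notebook does nothing in this cell once all three packages are installed.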


In [9]:
# load libraries
library(palmerpenguins)
library(ggplot2)
library(dplyr)

### Dataset Overview
This cell gives a brief overview of the dataset: the columns and their types, summary statistics, and the first five rows.

In [18]:
message("glimpse:")
glimpse(penguins)

message("summary:")
summary(penguins)

message("structure:")
str(penguins)

message("first five rows:")
head(penguins, 5)

glimpse:



Rows: 344
Columns: 8
$ species           <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adel…
$ island            <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgerse…
$ bill_length_mm    <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
$ bill_depth_mm     <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
$ body_mass_g       <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
$ sex               <fct> male, female, female, NA, female, male, female, male…
$ year              <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…


summary:



      species          island    bill_length_mm  bill_depth_mm  
 Adelie   :152   Biscoe   :168   Min.   :32.10   Min.   :13.10  
 Chinstrap: 68   Dream    :124   1st Qu.:39.23   1st Qu.:15.60  
 Gentoo   :124   Torgersen: 52   Median :44.45   Median :17.30  
                                 Mean   :43.92   Mean   :17.15  
                                 3rd Qu.:48.50   3rd Qu.:18.70  
                                 Max.   :59.60   Max.   :21.50  
                                 NA's   :2       NA's   :2      
 flipper_length_mm  body_mass_g       sex           year     
 Min.   :172.0     Min.   :2700   female:165   Min.   :2007  
 1st Qu.:190.0     1st Qu.:3550   male  :168   1st Qu.:2007  
 Median :197.0     Median :4050   NA's  : 11   Median :2008  
 Mean   :200.9     Mean   :4202                Mean   :2008  
 3rd Qu.:213.0     3rd Qu.:4750                3rd Qu.:2009  
 Max.   :231.0     Max.   :6300                Max.   :2009  
 NA's   :2         NA's   :2                  

structure:



tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
 $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ bill_length_mm   : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
 $ bill_depth_mm    : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
 $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
 $ body_mass_g      : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
 $ sex              : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
 $ year             : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...


first five rows:



| species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|
| \<fct\> | \<fct\> | \<dbl\> | \<dbl\> | \<int\> | \<int\> | \<fct\> | \<int\> |
| Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male | 2007 |
| Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female | 2007 |
| Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female | 2007 |
| Adelie | Torgersen | NA | NA | NA | NA | NA | 2007 |
| Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female | 2007 |
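
The summary output reports `NA's` in the four measurement columns and in `sex`, and the fourth row above has no measurements at all. A short sketch of how those gaps could be counted and, if desired, excluded before plotting is shown here. The names `na_counts` and `penguins_complete` are illustrative, and dropping incomplete rows is only one possible way to handle them, not necessarily the approach taken later in this notebook.

```r
library(palmerpenguins)

# Count missing values per column; the results should agree with the
# NA's rows printed by summary(penguins).
na_counts <- colSums(is.na(penguins))
print(na_counts)

# Assumption for illustration: keep only rows with no missing values.
penguins_complete <- penguins[complete.cases(penguins), ]
nrow(penguins_complete)
```

ggplot2 drops rows with missing values in the plotted variables on its own and emits a warning, so filtering first mainly keeps the output quiet and makes the exclusion explicit.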
