<span STYLE="font-size:150%"> 
    Plot population frequencies
</span>

Docker image: gnasello/datascience-env:2023-01-27 \
Latest update: 9 April 2023

# Load required packages and data

In [None]:
library(ggplot2)
library(ggpubr)

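Clone the [r_utils library](https://github.com/gabnasello/r_utils.git) from GitHub so that it sits next to this script. The `source()` calls below load it from `../r_utils/`, so adjust those paths if your clone lives elsewhere.

You can simply run the following command in a new terminal (open it from JupyterLab):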

`git clone https://github.com/gabnasello/r_utils.git`

For background on reusing functions that you define in other scripts, see this [tutorial](https://www.earthdatascience.org/courses/earth-analytics/multispectral-remote-sensing-data/source-function-in-R/).

In [None]:
source("../r_utils/ggplot_utils.R")
source("../r_utils/stats_utils.R")
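
If the clone is missing or sits in a different folder, `source()` fails with a fairly terse message. The sketch below wraps it in a small check; `source_if_exists()` is a hypothetical helper written for this illustration (it is not part of `r_utils`), and the temporary script stands in for the real utility files.

```r
# Hypothetical helper (not part of r_utils): fail early with a clear message
# when the file to source cannot be found.
source_if_exists <- function(path) {
  if (!file.exists(path)) {
    stop("Cannot find '", path, "'. Clone r_utils first or fix the path.")
  }
  source(path)
  invisible(TRUE)
}

# Demo with a throwaway script standing in for ggplot_utils.R / stats_utils.R
tmp_script <- tempfile(fileext = ".R")
writeLines("add_one <- function(x) x + 1", tmp_script)
source_if_exists(tmp_script)
add_one(1)
```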

Set the parameters of the `cyto_plot()` function

In [None]:
label_text_size = 1.4; label_fill_alpha = 0.2; label_text_font = 1
axes_text_size = 1.4
axes_label_text_size = 1.6; title_text_size = 1.5; header_text_size = 1.3
legend_text_size = 1.7

## Load data

In [None]:
#create data frame with 0 rows and 5 columns
df <- data.frame(matrix(ncol = 5, nrow = 0))

#provide column names
colnames(df) <- c('name','Population','Parent','Frequency','group')

In [None]:
filetable <- 'data/Control_FoxP3_freq.csv'
group <- 'Control'


df_dataset <- read.csv(file=filetable)
df_dataset['group'] <- rep(group, nrow(df_dataset))

df_dataset

# Adding Rows in dataframes
# https://www.statmethods.net/management/merging.html
df <- rbind(df, df_dataset)
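
The cell above labels every row with its group through `rep(group, nrow(df_dataset))`. As a small illustration on toy data (the column values are invented for the example), note that `nrow()` counts rows while `length()` on a data frame counts columns, and that assigning a single value recycles it across all rows.

```r
# Toy data frame standing in for one of the frequency tables
toy <- data.frame(name = c("a.fcs", "b.fcs", "c.fcs"),
                  Frequency = c(10.5, 12.1, 9.8))

n_rows <- nrow(toy)    # number of rows (samples)
n_cols <- length(toy)  # number of columns, NOT rows

# A single value is recycled to every row, so rep() is optional here
toy$group <- "Control"
toy
```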

In [None]:
filetable <- 'data/RD1_FoxP3_freq.csv'
group <- 'RD1'


df_dataset <- read.csv(file=filetable)
df_dataset['group'] <- rep(group, nrow(df_dataset))

df_dataset

# Adding Rows in dataframes
# https://www.statmethods.net/management/merging.html
df <- rbind(df, df_dataset)

In [None]:
filetable <- 'data/RD3_FoxP3_freq.csv'
group <- 'RD3'


df_dataset <- read.csv(file=filetable)
df_dataset['group'] <- rep(group, nrow(df_dataset))

df_dataset

# Adding Rows in dataframes
# https://www.statmethods.net/management/merging.html
df <- rbind(df, df_dataset)

In [None]:
filetable <- 'data/RD5_FoxP3_freq.csv'
group <- 'RD5'


df_dataset <- read.csv(file=filetable)
df_dataset['group'] <- rep(group, nrow(df_dataset))

df_dataset

# Adding Rows in dataframes
# https://www.statmethods.net/management/merging.html
df <- rbind(df, df_dataset)

In [None]:
filetable <- 'data/AS_FoxP3_freq.csv'
group <- 'AS2863619'


df_dataset <- read.csv(file=filetable)
df_dataset['group'] <- rep(group, nrow(df_dataset))

df_dataset

# Adding Rows in dataframes
# https://www.statmethods.net/management/merging.html
df <- rbind(df, df_dataset)
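
The five loading cells repeat the same read, label, and append steps. One possible way to condense them is a named vector of files processed in a loop. The sketch below is self-contained, so it writes two toy CSVs to a temporary folder instead of reading the real `data/*.csv` files; `load_group()` and the toy values are assumptions made for the illustration.

```r
# Toy CSVs in a temp folder stand in for the real data/*.csv files
tmp_dir <- tempfile("freq_demo_")
dir.create(tmp_dir)
write.csv(data.frame(name = c("s1", "s2"), Frequency = c(10, 12)),
          file.path(tmp_dir, "Control_FoxP3_freq.csv"), row.names = FALSE)
write.csv(data.frame(name = c("s3", "s4"), Frequency = c(20, 24)),
          file.path(tmp_dir, "RD1_FoxP3_freq.csv"), row.names = FALSE)

# Names are the group labels, values are the file paths
files <- c(Control = file.path(tmp_dir, "Control_FoxP3_freq.csv"),
           RD1     = file.path(tmp_dir, "RD1_FoxP3_freq.csv"))

# Hypothetical helper: read one table and tag every row with its group
load_group <- function(path, group) {
  d <- read.csv(file = path)
  d$group <- group
  d
}

# Read every file, then stack the tables row-wise
tables <- Map(load_group, files, names(files))
df_all <- do.call(rbind, unname(tables))
df_all
```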

## Show whole dataset

In [None]:
df

## Summarize the data

`data_summary()`, made available by the `r_utils` scripts sourced above, calculates the mean and the standard deviation of the variable of interest in each group. See this [tutorial](http://www.sthda.com/english/wiki/ggplot2-line-plot-quick-start-guide-r-software-and-data-visualization#line-graph-with-error-bars) for the approach.

In [None]:
df1 <- data_summary(df, varname="Frequency", 
                    groupnames=c("group"))
df1
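
To make the summary step concrete, here is a minimal base-R sketch of a per-group mean and standard deviation. It is not the `r_utils` implementation: `summarise_by_group()` is a hypothetical stand-in, and it assumes the output layout that the plotting code relies on (the mean stored under the variable's own name plus an `sd` column).

```r
# Hypothetical stand-in for data_summary(): mean and sd of one variable per group
summarise_by_group <- function(data, varname, groupname) {
  means <- tapply(data[[varname]], data[[groupname]], mean)
  sds   <- tapply(data[[varname]], data[[groupname]], sd)
  out <- data.frame(names(means), as.numeric(means), as.numeric(sds))
  names(out) <- c(groupname, varname, "sd")
  out
}

# Toy data, invented for the example
toy <- data.frame(group = c("A", "A", "B", "B"),
                  Frequency = c(10, 14, 20, 30))
toy_summary <- summarise_by_group(toy, "Frequency", "group")
toy_summary
```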

Sort the rows by group in a custom order, as described [here](https://www.statology.org/arrange-rows-r/)

In [None]:
df1 <- df1 %>% arrange(factor(group, levels = c('Control', 'RD1', 'RD3', 'RD5', 'AS2863619')))
df1
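
The custom ordering works because a factor sorts by the position of its levels rather than alphabetically. The same idea can be sketched in base R with `order()`, shown here on toy data with a shortened list of groups (the values are invented for the example).

```r
# Toy summary table with rows in arbitrary order
toy <- data.frame(group = c("RD1", "AS2863619", "Control"),
                  Frequency = c(3, 1, 2))

# Desired display order
group_levels <- c("Control", "RD1", "AS2863619")

# order() on the factor sorts by level position, not alphabetically
toy_sorted <- toy[order(factor(toy$group, levels = group_levels)), ]
rownames(toy_sorted) <- NULL
toy_sorted
```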

# Plot data

In [None]:
# Bars show the group means, error bars show mean ± SD,
# and points show the individual samples
p <- ggbarplot(df1, x = "group", y = "Frequency", fill = "group") + 
     geom_errorbar(aes(ymin=Frequency-sd, ymax=Frequency+sd), width=.4) +
     geom_point(data=df, aes(x=group, y=Frequency), stroke = 1, shape = 21, size = 2.25)

img <- ggplotMinAethetics(p, width=2.5, height=4,
                          plot.title=element_text(size = 13),
                          xlabel=' ', 
                          ylabel='FoxP3+ Frequency (%)', 
                          scale_fill='npg',
                          legend.position="none"
                          ) + 
       scale_y_continuous(expand=c(0,0), limits=c(0,65)) + 
       theme(axis.text.x = element_text(angle = 60, vjust = 1, hjust=1))

img
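
For readers without `ggpubr` or `r_utils` at hand, a roughly comparable figure can be sketched with plain `ggplot2`. This is an approximation under stated assumptions, not a reproduction of `ggbarplot()` or `ggplotMinAethetics()`: the toy data are invented, and `theme_classic()` is swapped in for the custom styling.

```r
library(ggplot2)

# Toy summary (means and SDs) and toy individual samples
summary_toy <- data.frame(group = c("A", "B"),
                          Frequency = c(12, 25),
                          sd = c(2.8, 7.1))
points_toy <- data.frame(group = c("A", "A", "B", "B"),
                         Frequency = c(10, 14, 20, 30))

p_toy <- ggplot(summary_toy, aes(x = group, y = Frequency)) +
  geom_col(aes(fill = group), colour = "black") +              # bars: group means
  geom_errorbar(aes(ymin = Frequency - sd, ymax = Frequency + sd),
                width = 0.4) +                                 # error bars: mean +/- SD
  geom_point(data = points_toy, shape = 21, stroke = 1,
             size = 2.25) +                                    # individual samples
  scale_y_continuous(expand = c(0, 0), limits = c(0, 65)) +
  labs(x = NULL, y = "FoxP3+ Frequency (%)") +
  theme_classic() +
  theme(legend.position = "none",
        axis.text.x = element_text(angle = 60, vjust = 1, hjust = 1))

p_toy
```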