# Google Colab, Jupyter Notebooks, Fundamentals of R and Python, and Galaxy
**Instructor:** Nuno S. Osório 
**Course:** PhD4MOZ

## Class Overview
This class introduces Google Colab and Jupyter Notebooks, covers the basics of R and Python, and gives a first look at the Galaxy platform for bioinformatics analysis.

### Learning Objectives:
1. Understand the basic interface and functionality of Google Colab and Jupyter Notebooks.
2. Explore and compare basic operations in Python and R.
3. Learn about Galaxy and its use in bioinformatics workflows.

## Section 1: Introduction to Google Colab and Jupyter Notebooks

### What is Google Colab?
Google Colab is a free, cloud-based service for writing and running Python code in a hosted Jupyter Notebook environment, directly in the browser and with no local installation. It is widely used for data science, machine learning, and research.

### What is Jupyter Notebook?
Jupyter Notebook is an open-source web application for creating documents (notebooks) that combine live code, visualizations, and explanatory text in one place.
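
To make "all in one place" more concrete: a notebook file (`.ipynb`) is a JSON document that holds a list of cells, each marked as text or code. The following is a minimal sketch using only the Python standard library. The cell contents are illustrative assumptions, and notebooks saved by Colab or Jupyter carry additional metadata not shown here.

```python
import json

# A minimal notebook in the nbformat 4 layout:
# one markdown (text) cell followed by one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# My first notebook\n", "Explanatory text goes here."],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # the cell has not been run yet
            "metadata": {},
            "outputs": [],            # filled in when the cell is executed
            "source": ['print("Hello, PhD4MOZ class!")'],
        },
    ],
}

# Serialize to the JSON text that would be stored in an .ipynb file,
# then parse it back to confirm the structure survives the round trip.
text = json.dumps(notebook, indent=1)
reloaded = json.loads(text)

cell_types = [cell["cell_type"] for cell in reloaded["cells"]]
print(cell_types)
```

Because the file is plain JSON, notebooks can be shared, versioned, and opened by any tool that understands the format, including Colab.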

In [None]:
# Let's start with a simple Python example in Google Colab
print("Hello, PhD4MOZ class!")

x = 10
y = 20
print(f"The sum of {x} and {y} is: {x + y}")

### Exercise 1: Python Practice in Colab
Write a Python program in Colab that calculates the area of a circle given its radius. Use the formula: $\text{area} = \pi r^2$. (You can use `math.pi` for the value of $\pi$.)

## Section 2: Fundamentals of R and Python
R and Python are two of the most popular programming languages for data science and scientific research. Although they differ in syntax and some capabilities, they share common principles when it comes to handling data and performing calculations.

### Example: DataFrames in Python and R
A DataFrame stores tabular data as rows and named columns. Here is how you can create a DataFrame in Python using the `pandas` library.

In [None]:
# Python example: Creating a DataFrame
import pandas as pd

data = {'Name': ['John', 'Jane', 'Alice', 'Bob'],
        'Age': [23, 25, 30, 35]}
df_python = pd.DataFrame(data)
print("Python DataFrame:")
print(df_python)

In R, the equivalent would be:
```r
data <- data.frame(Name = c('John', 'Jane', 'Alice', 'Bob'), Age = c(23, 25, 30, 35))
print(data)
```
You can try this code in RStudio.
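
Once a DataFrame exists, the most common next steps are inspecting it, selecting a column, and filtering rows. The sketch below repeats the same small table so that it runs on its own; the threshold of 25 is an arbitrary value chosen for illustration.

```python
import pandas as pd

# Same illustrative data as in the example above
df = pd.DataFrame({
    "Name": ["John", "Jane", "Alice", "Bob"],
    "Age": [23, 25, 30, 35],
})

# Shape is (number of rows, number of columns)
print(df.shape)

# Select one column and compute a summary statistic
mean_age = df["Age"].mean()
print(f"Mean age: {mean_age}")

# Keep only the rows where a condition holds (boolean filtering)
older = df[df["Age"] > 25]
print(older)
```

For comparison, the corresponding base R expressions on the `data` object would be `dim(data)`, `mean(data$Age)`, and `data[data$Age > 25, ]`.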

### Exercise 2: Comparing DataFrames in Python and R
Create a DataFrame that stores information about students (Name, Age, and Grade) in both Python (Colab) and R (RStudio). Compare the syntax and functionality.

## Section 3: Basic Operations and Functions
R and Python handle basic arithmetic operations, loops, and conditionals in similar ways; the main differences are in syntax.

In [None]:
# Python example: Basic conditional statement
x = 15
if x > 10:
    print(f"{x} is greater than 10")
else:
    print(f"{x} is not greater than 10")

In R, the equivalent would be:
```r
x <- 15
if (x > 10) {
    print(paste(x, "is greater than 10"))
} else {
    print(paste(x, "is not greater than 10"))
}
```
Try this code in RStudio.
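
This section also mentions arithmetic and loops, which the conditional example does not cover. The following is a minimal Python sketch of both; the numbers are arbitrary and chosen only so the results are easy to check by hand.

```python
# Basic arithmetic operators
a, b = 17, 5
quotient = a / b         # true division
floor_quotient = a // b  # integer (floor) division
remainder = a % b        # modulo: what is left after integer division
power = a ** 2           # exponentiation

print(quotient, floor_quotient, remainder, power)

# A for loop: add up the squares of the numbers 1 to 5
total = 0
for n in range(1, 6):    # range(1, 6) yields 1, 2, 3, 4, 5
    total += n ** 2

print(f"Sum of squares from 1 to 5: {total}")
```

In R the same operations are written `a / b`, `a %/% b`, `a %% b`, and `a^2`, and the loop header becomes `for (n in 1:5) { ... }`.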

### Exercise 3: Checking Even or Odd
Write a program in both Python (Colab) and R (RStudio) that takes a number as input from the user, checks if it's even or odd, and prints the result.

## Section 4: Visualization in Python and R
Both Python and R offer powerful libraries for data visualization. In Python, `matplotlib` is commonly used, while in R, `ggplot2` is very popular.

In [None]:
# Python example: Plotting with matplotlib
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]
plt.plot(x, y)
plt.title("Simple Line Plot")
plt.xlabel("X axis")
plt.ylabel("Y axis")
plt.show()

In R, you can use `ggplot2` for visualizations. This example reuses the `data` data frame created in Section 2:
```r
library(ggplot2)
ggplot(data, aes(x=Name, y=Age)) + geom_bar(stat="identity") + ggtitle("Bar Plot of Age by Name")
```
Try this in RStudio.
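
To compare the two libraries on the same chart, here is a rough `matplotlib` counterpart of the R bar plot. It is a sketch that assumes the same names and ages used in Section 2, and the axis labels are chosen here for illustration.

```python
import matplotlib.pyplot as plt

# Same illustrative data as the DataFrame in Section 2
names = ["John", "Jane", "Alice", "Bob"]
ages = [23, 25, 30, 35]

# Create a figure with one set of axes and draw one bar per name
fig, ax = plt.subplots()
bars = ax.bar(names, ages)

ax.set_title("Bar Plot of Age by Name")
ax.set_xlabel("Name")
ax.set_ylabel("Age")

plt.show()
```

One visible difference: `matplotlib` keeps the bars in the order of the list, while `ggplot2` sorts a character column alphabetically on the x axis.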

### Exercise 4: Visualization
Create a plot that shows how a quantity changes over time (e.g., monthly sales) in both Python and R. Compare the plotting libraries and their ease of use.

## Section 5: Introduction to Galaxy
Galaxy is an open-source platform for bioinformatics research that aims to make computational analyses accessible, reproducible, and transparent. Through its web interface you can run complex workflows, such as genome analysis and RNA-seq, without deep programming knowledge.

### Getting Started with Galaxy
- Visit [usegalaxy.org](https://usegalaxy.org/)
- Create an account and explore its tools and workflows.
- You can upload your own data or use public datasets to run analysis pipelines such as mapping, assembly, and variant calling.

### Exercise 5: Bioinformatics with Galaxy
Explore Galaxy and try running a basic workflow such as read alignment or variant calling. Compare the ease of use of Galaxy's graphical interface with scripting in Python or R for similar bioinformatics tasks.

## Homework Assignment
Pick a dataset of your choice (from Kaggle, Galaxy, or another source) and analyze it using Python in Google Colab. Then, repeat the analysis using R in RStudio or a similar platform. Prepare a short report comparing the two workflows, highlighting what you found easier or more challenging in each environment.