# Project in Data Sciences Course

Welcome to the "Project in Data Sciences" course! In this course, we will work with raw par-seqFISH data and explore data analysis and visualization in Python. The main goal is to build practical skills in analyzing and interpreting biological data in order to understand gene expression levels and their dynamics.

The course is divided into three parts:

### Part 1: Working with Images

In the first part of the course, we will learn how to handle and process images, specifically using the par-seqFISH data in the nd2 format. We will explore techniques for image manipulation, visualization, and analysis using Python libraries such as Napari, NumPy, and OpenCV.

### Part 2: Objects and Segmentation

In the second part of the course, we will delve into the concept of objects and segmentation in images. We will learn how to identify and extract objects of interest from the par-seqFISH images, and perform basic image segmentation techniques to separate cells into different conditions. We will also explore techniques for measuring various cellular properties such as size, DAPI, and rRNA distributions using image analysis.

### Part 3: Final Project

In the final part of the course, we will bring together the skills and knowledge gained from the previous parts to work on a final project. This project will involve analyzing gene expression levels and their dynamics in the par-seqFISH data, and applying various data analysis and visualization techniques to gain meaningful insights.


# Lesson 1 - Introduction to Images and Working with them in Python

Welcome! In this notebook, we will go through some basic image processing in Python and get familiar with utilities that are useful in any computer vision pipeline. These utilities are provided by libraries such as numpy, napari, skimage, glob, tqdm and more.

### Working with Conda Environment and Installing libraries

To ensure a clean and isolated working environment, we will be using Conda, a popular package management system and environment manager for Python. Conda allows us to create and manage virtual environments with specific dependencies for our project, which can help avoid conflicts between different packages and ensure reproducibility.

Here are the steps to create and activate a Conda environment and install the required libraries:

1. Open Anaconda or Miniconda: If you have Anaconda or Miniconda installed on your machine, open the Anaconda Navigator or the Conda command prompt, respectively.

2. Create a New Conda Environment: In the Conda command prompt, run the following command to create a new environment with a specific Python version:

`conda create -n my_env python=3.8`

Replace "my_env" with the name you want to give to your environment, and "python=3.8" with the desired Python version.

3. Activate the Environment: In the Conda command prompt, run the following command to activate the environment:

`conda activate my_env`

4. Install libraries: With the Conda environment activated, we can now install the required packages. In the Conda command prompt, run the following commands to install NumPy, Napari, Matplotlib, and nd2:

`conda install numpy`

`conda install -c conda-forge napari`

`conda install matplotlib`

`pip install nd2`

5. Verify the Installation: To verify that the packages were installed successfully, you can run the following Python code in your Jupyter notebook:

```python
import numpy as np
import napari
import nd2
import matplotlib
```
If the imports run without errors, you are all set to start working with images in your Conda environment!
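Beyond a bare import, it can help to record which versions are installed. The following is a minimal sketch that uses the standard-library `importlib.metadata` module to look up versions without importing the packages. The helper name `installed_versions` is hypothetical and not part of the course material.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names of the packages installed in the steps above
report = installed_versions(["numpy", "napari", "nd2", "matplotlib"])
for name, ver in report.items():
    print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be installed with the matching command from step 4.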

With NumPy and Napari installed, you now have the necessary tools to manipulate and analyze par-seqFISH images in Python. Let's move forward and explore the exciting world of image analysis and visualization in the next sections!

### Images as arrays 

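A color (RGB) image is typically represented as a NumPy array of shape (height, width, channels). Other layouts exist: the microscopy data loaded later in this lesson puts the channel axis first.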

![RGB image as a numpy array](asserts/image_as_array.png)

<div style="text-align: right"> Credit: <a href="https://e2eml.school/convert_rgb_to_grayscale.html">Brandon Rohrer’s Blog</a></div>
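To make the array layout concrete, here is a small sketch on a synthetic image (the pixel values are made up for illustration). It builds a tiny RGB array, reads a single pixel and a single channel, and converts the array to a channel-first layout with `np.moveaxis`.

```python
import numpy as np

# A tiny synthetic RGB image: 2 rows (height) x 3 columns (width) x 3 channels
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]   # top-left pixel is pure red
rgb[1, 2] = [0, 0, 255]   # bottom-right pixel is pure blue

print(rgb.shape)          # (height, width, channels)
print(rgb[0, 0])          # all channel values of one pixel
print(rgb[:, :, 0])       # the red channel as a 2D array

# Microscopy readers often return channel-first arrays instead;
# np.moveaxis converts between the two layouts without changing the data.
channel_first = np.moveaxis(rgb, -1, 0)
print(channel_first.shape)  # (channels, height, width)
```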


Multiple utilities and packages exist for reading images from files in Python. In this course, we will use `nd2.ND2File`.


If you look in the directory containing this notebook, you will find a folder called data which includes some nd2 files. 

Let's load one image.

```python
import nd2

path = r'data/Count00000_Point0000_ChannelPHASE 60x-100x PH3,DAPI,A488,A555,A647_Seq0000.nd2'
img = nd2.ND2File(path)
```

### Extracting Image Data and Metadata

Now that we have loaded the image using the "nd2" library, we can extract the image data and metadata from the "img" object. 

**Image Data Extraction:**

To extract the image data as a NumPy array, we can use the `asarray()` method of the `ND2File` object. For this file, the result is a 3D NumPy array in which the first dimension is the channel and the second and third dimensions are the row and column indices, respectively. In general, the number and order of dimensions depend on how the file was acquired; the `sizes` attribute of the `img` object lists the axis names and their sizes. Here's an example:

```python
# Extract the image data as a NumPy array
image = img.asarray()
```
The resulting "image" variable will contain the image data as a NumPy array, which can be further processed and analyzed using various NumPy and image processing functions.
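As a quick illustration of what can be inspected on such an array, the sketch below uses a synthetic stand-in rather than the real file. The shape (5 channels of 64 x 64 pixels) and the `uint16` data type are assumptions chosen for the example, not values read from the course data.

```python
import numpy as np

# Synthetic stand-in for the array returned by asarray():
# 5 channels of 64 x 64 pixels, stored as 16-bit unsigned integers (assumed)
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 4096, size=(5, 64, 64), dtype=np.uint16)

print("shape:", image.shape)        # (channels, rows, columns)
print("dtype:", image.dtype)        # pixel data type
print("ndim:", image.ndim)          # number of dimensions
print("size in bytes:", image.nbytes)

first_channel = image[0]            # 2D array for the first channel
print("first channel shape:", first_channel.shape)
```

The same attributes are available on the array extracted from the real `img` object.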

**Metadata Extraction:**

The "nd2" library also provides access to the metadata associated with the image, which contains information about the acquisition settings, microscope parameters, and other relevant information. To extract the metadata, we can simply access the "metadata" attribute of the "img" object, like this:

```python
# Extract the metadata from the img object
metadata = img.metadata
```



```python
# Extract the image data as a NumPy array

# Your answer #
```

### Loading an Image and Exploring Image Properties

Now that we have imported the `nd2` package and loaded an image from the provided path, let's explore some properties of the image.

1. **Image Shape:** To find out the shape of the image, you can simply print the `shape` attribute of the `img` object, like this:

    ```python
    print("Image Shape:", img.shape)
    ```





```python
# What is the shape of the image? (e.g., (channels, height, width))

# Your answer #
```

2. **Channel Ranges:** Next, let's determine the range of intensity values for each channel in the image. Index the NumPy array returned by `asarray()` rather than the `ND2File` object itself, and apply the array's `min()` and `max()` methods to each channel. The channel names are read here from `img.metadata.channels`, like this:

    ```python
    image = img.asarray()  # (channels, height, width) for this file
    for idx, ch in enumerate(img.metadata.channels):
        name = ch.channel.name  # channel name stored in the file metadata
        print("Channel", name, "Range: min =", image[idx].min(), ", max =", image[idx].max())
    ```
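The per-channel loop can also be written as a single reduction over the row and column axes. The following sketch runs on a synthetic channel-first array with made-up values, so it does not depend on the course data.

```python
import numpy as np

# Synthetic channel-first image (channels, rows, columns); values are made up
rng = np.random.default_rng(seed=1)
image = rng.integers(100, 1000, size=(3, 32, 32), dtype=np.uint16)
image[1] += 500  # make the second channel brighter so the ranges differ

# Reduce over the row and column axes, leaving one value per channel
channel_min = image.min(axis=(1, 2))
channel_max = image.max(axis=(1, 2))

for idx, (lo, hi) in enumerate(zip(channel_min, channel_max)):
    print(f"Channel {idx} range: min = {lo}, max = {hi}")
```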





```python
# What is the range of intensity values for each channel in the image? (e.g., Channel 1 Range: min = 0, max = 255)

# Your answer #
```


3. **Metadata Extraction:** Image metadata can provide valuable information about the acquisition settings, microscope parameters, and other experimental details. You can extract the metadata from the `img` object using the `metadata` attribute, like this:

    ```python
    image_metadata = img.metadata
    ```

    Please answer the following question: What information can you gather from the extracted metadata? (e.g., acquisition date and time, microscope model, objective lens used)

Take a moment to explore the loaded image and answer the questions above. Understanding the properties and metadata of the image will help us in further analysis and interpretation of the data. Feel free to refer to the `nd2` documentation for more details on working with ND2 images.


```python
# What information can you gather from the extracted metadata? (e.g., acquisition date and time, microscope model, objective lens used)

# Your answer #
```

### Displaying Image using Matplotlib

In addition to exploring the image properties, we can also visualize the image using the Matplotlib library. Matplotlib provides various functions to display images, such as `imshow()`.

To display the loaded image, you can use the following code:

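```python
import matplotlib.pyplot as plt

# Display the first channel of the image
image = img.asarray()  # (channels, height, width) for this file
channel_index = 0  # Choose a channel to display
channel_name = img.metadata.channels[channel_index].channel.name
plt.imshow(image[channel_index], cmap='gray')
plt.title('Image Channel: ' + channel_name)
plt.colorbar()
plt.show()
```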
This code snippet uses the `imshow()` function to display the image data of a selected channel with a grayscale colormap. The `title()` function sets the title of the plot, and the `colorbar()` function adds a colorbar that shows the intensity scale.

You can also customize the display properties such as `colormap`, `colorbar`, and other visual settings as needed. Matplotlib provides extensive documentation for further customization options.

Please go ahead and use the above code to display the image, and experiment with different channels and visualization settings as needed. Visualizing the image can help you gain insights into the data and better understand the characteristics of the image.

Note: Make sure to run the `plt.show()` function to display the plot in the notebook.
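One display setting that often matters for microscopy images is the intensity range mapped to the colormap. The sketch below is one possible approach, not a prescribed course method: it computes display limits from percentiles on a synthetic image, so that a few very bright pixels do not wash out the rest. The 1st and 99th percentiles are arbitrary choices for illustration.

```python
import numpy as np

# Synthetic single-channel image: dim background with a few very bright pixels
rng = np.random.default_rng(seed=2)
channel_image = rng.integers(90, 110, size=(128, 128)).astype(np.uint16)
channel_image[:2, :2] = 60000  # four saturated outlier pixels

# The full range is dominated by the outliers ...
print("min/max:", channel_image.min(), channel_image.max())

# ... whereas percentiles give display limits that follow the bulk of the pixels
vmin, vmax = np.percentile(channel_image, [1, 99])
print("1st/99th percentile:", vmin, vmax)

# These limits can then be passed to Matplotlib, for example:
# plt.imshow(channel_image, cmap='gray', vmin=vmin, vmax=vmax)
```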

```python
# Your answer #
```

### Image Slicing and Cropping

Now that we have loaded the image, let's explore how to extract and manipulate regions of interest (ROI) from the image using image slicing and cropping.

**Image Slicing:**

Image slicing allows you to extract specific portions of the image by indexing the image data. You can use square brackets with colon (':') notation to specify the range of indices for each dimension. For example:

```python
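# Extract a slice (crop) from one channel of the image array
image = img.asarray()  # (channels, height, width) for this file
image_slice = image[channel_index, start_row:end_row, start_col:end_col]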
```
Try experimenting with different values for start_row, end_row, start_col, and end_col to extract and crop different regions of the image. You can also visualize the extracted or cropped regions using Matplotlib to visually inspect the results.

Please go ahead and try slicing and cropping the image using the provided instructions, and feel free to ask any questions or seek clarification if needed.
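Here is a small worked example of cropping, run on a synthetic channel-first array so that it is independent of the course data. The index values are arbitrary. It also shows that basic slicing returns a view of the original array, so a copy is needed before editing a crop independently.

```python
import numpy as np

# Synthetic channel-first image (channels, rows, columns)
image = np.arange(2 * 6 * 8, dtype=np.uint16).reshape(2, 6, 8)

channel_index = 0
start_row, end_row = 1, 4    # rows 1, 2, 3 (end index is exclusive)
start_col, end_col = 2, 7    # columns 2 to 6

crop = image[channel_index, start_row:end_row, start_col:end_col]
print(crop.shape)            # (end_row - start_row, end_col - start_col)

# Basic slicing returns a view: writing to it would modify `image` as well.
# Take a copy when the crop should be edited independently.
crop_copy = crop.copy()
crop_copy[:] = 0
print(image[channel_index, start_row, start_col])  # original value is untouched
```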


```python
# Your answer #
```