Merge pull request WayScience#23 from jenna-tomkinson/nf1_dp_normalization

deepprofiler project processing features
jenna-tomkinson committed Jan 18, 2023
2 parents 1ae36c3 + 3d18b66 commit e71bb1f
Showing 13 changed files with 2,561 additions and 10 deletions.
7 changes: 5 additions & 2 deletions 4_processing_features/4.extract_sc_features.sh
@@ -1,3 +1,6 @@
#!/bin/bash
jupyter nbconvert --to python extract_single_cell_features.ipynb
python extract_single_cell_features.py
jupyter nbconvert --to python *.ipynb

python extract_sc_features_cp.py

python extract_sc_features_dp.py
2 changes: 1 addition & 1 deletion 4_processing_features/4.processing_features.yml
@@ -14,4 +14,4 @@ dependencies:
- conda-forge::numpy=1.22
- conda-forge::scikit-learn
- pip:
- git+https://github.com/cytomining/pycytominer@e43be528d3ca6d77d2b1a347fe8e4131dfc65bf0
- git+https://github.com/cytomining/pycytominer@afac3ea16818ad25f37318ecd5c5090c0eff5806
17 changes: 10 additions & 7 deletions 4_processing_features/README.md
@@ -1,17 +1,18 @@
# 4. Processing Extracted Single Cell Features

In this module, we present our pipeline for processing outputted `.sqlite` file with single cell features from CellProfiler.
The processed features are saved into compressed `.csv.gz` for use during statistical analysis.
In this module, we present our pipeline for processing outputted `.sqlite` file with single cell features from CellProfiler (CP) and DeepProfiler (DP).

The processed CP features are saved into compressed `.csv.gz` and DP features are saved as `.npz` files for use during statistical analysis.
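
As a rough illustration of consuming the compressed `.csv.gz` output, the sketch below reads a gzip-compressed CSV using only the Python standard library. The function name `read_csv_gz`, the file name, and the column names are hypothetical and chosen for the demo; they are not taken from this repository, and the demo table is not real NF1 data.

```python
import csv
import gzip
import os
import tempfile


def read_csv_gz(path):
    """Read a gzip-compressed CSV file into a list of row dictionaries."""
    # "rt" opens the compressed stream in text mode so csv can parse it directly.
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Round-trip demo with a tiny hypothetical table (not real NF1 data).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_features.csv.gz")
    with gzip.open(demo_path, "wt", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Metadata_Well", "Feature_A"])
        writer.writerow(["A1", "0.5"])
    rows = read_csv_gz(demo_path)
```

Note that `csv.DictReader` returns every value as a string, so numeric feature columns would still need converting before analysis.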

## Pycytominer

We use [Pycytominer](https://github.com/cytomining/pycytominer) to perform the aggregation, merging, and normalization of the NF1 single cell features.
We use [Pycytominer](https://github.com/cytomining/pycytominer) to perform the merging, normalization, and feature selection of the NF1 single cell features.

For more information regarding the functions that we used, please see [the documentation](https://pycytominer.readthedocs.io/en/latest/pycytominer.cyto_utils.html#pycytominer.cyto_utils.cells.SingleCells.merge_single_cells) from the Pycytominer team.

### Normalization

CellProfiler features can display a variety of distributions across cells.
CellProfiler and DeepProfiler features can display a variety of distributions across cells.
To facilitate analysis, we standardize all features (z-score) to the same scale.
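
The z-score standardization described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the pipeline's actual implementation (which relies on Pycytominer): the feature names and values are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Standardize one feature column to mean 0 and unit variance."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        # A constant feature has no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical single-cell measurements for two features on very different scales.
features = {
    "Cells_AreaShape_Area": [1200.0, 1500.0, 900.0, 1800.0],
    "Nuclei_Intensity_MeanIntensity": [0.12, 0.15, 0.09, 0.18],
}

# Standardize each feature column independently.
standardized = {name: zscore(column) for name, column in features.items()}
```

After standardization, both columns are expressed in units of standard deviations from their own mean, so features measured on different raw scales become directly comparable.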

---
@@ -27,15 +28,17 @@ Make sure you are in the `4_processing_features` directory before performing the
conda env create -f 4.processing_features.yml
```

## Step 2: Normalize Single Cell Features
## Step 2: Normalize and Feature Select Single Cell Features

### Step 2a: Set Up Paths

Within the [extract_single_cell_features.ipynb](4_processing_features/extract_single_cell_features.ipynb) notebook, you can chnage the paths to reflect the local paths or names for your machine (***IF* you changed anything from the original pipeline**) for the various parameters (e.g. CellProfiler directory, output directory, path to sqlite file, etc.)
Within the [extract_sc_features_cp.ipynb](4_processing_features/extract_sc_features_cp.ipynb) notebook, you can change the paths to reflect the local paths or names for your machine (***IF* you changed anything from the original pipeline**) for the various parameters (e.g. CellProfiler directory, output directory, path to sqlite file, etc.)

As well, you can update the paths with the [extract_sc_features_dp.ipynb](4_processing_features/extract_sc_features_dp.ipynb) notebook if the paths to the project are different on your local machine.

### Step 2b: Run Extract Single Cell Features

Using the code below, run the notebook to extract and normalize single cell features from CellProfiler.
Using the code below, run the notebook to extract and normalize single cell features from CellProfiler and DeepProfiler.

```bash
# Run this script in terminal
Binary file not shown.
Binary file not shown.
257 changes: 257 additions & 0 deletions 4_processing_features/data/nf1_sc_norm_deepprofiler_cyto.csv.gz

Large diffs are not rendered by default.

258 changes: 258 additions & 0 deletions 4_processing_features/data/nf1_sc_norm_deepprofiler_nuc.csv.gz

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.