# Omics data


## omics Function

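`OmicsDataCreate` selects the experiments that match the given filters and writes the corresponding files to a `.tar.gz` archive in `output_path`.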

### Arguments

- `expid` (list): List of experiment IDs.
- `assembly` (list): List of genome assemblies.
- `assembly_threshold` (str): Threshold passed as a string, e.g. `'05'` or `'50'`; the example below stores it under the key `"Q-value"`.
- `antigen_class` (list): List of antigen classes.
- `antigen` (list): List of antigens.
- `cell_type` (list): List of cell type classes.
- `cell` (list): List of cell types.
- `output_path` (Path): Output path for the processed files.

### Returns

- `Path`: Path to the processed files.

### Usage

```python
result = OmicsDataCreate(
    expid=None, assembly=['hg38'], assembly_threshold='05',
    antigen_class=None, antigen=None, cell_type=None, cell=None,
    output_path=Path("./storage/"),
)
```

In [1]:
```python
from OmicsDC import OmicsDataCreate
```

In [2]:
```python
from pathlib import Path

Example = {
    "expid"     :   ["DRX001045"],
    "assembly"  :   ["mm9"],
    "Q-value"   :   "50",
    "a.g_class" :   ["Histone"],
    "antigen"   :   ["H3.3"],
    "cell_type" :   ["Muscle"],
    "cell"      :   ["C2C12"],
    "output"    :   Path("./storage/")
}

result = OmicsDataCreate(
    expid               = Example["expid"],
    assembly            = Example["assembly"],
    assembly_threshold  = Example["Q-value"],
    antigen_class       = Example["a.g_class"],
    antigen             = Example["antigen"],
    cell_type           = Example["cell_type"],
    cell                = Example["cell"],
    output_path         = Example["output"]
)
```

```text
File check of mm9 complete
Done. Created file ./storage/2023-12-12_04_38_33.tar.gz
```

In [6]:
```python
!tar -xvzf <insert here created file> -C ./storage
```

```text
mm9/DRX001045_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX001047_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX001061_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX001063_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX020493_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX020502_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX021083_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/DRX021084_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/SRX9460293_mm9_Histone_H3.3_Muscle_C2C12.bed
mm9/SRX9460298_mm9_Histone_H3.3_Muscle_C2C12.bed
```
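
Outside a notebook, the extraction step can also be done from Python. The following is a minimal sketch using the standard-library `tarfile` module; the helper name `extract_archive` is hypothetical and not part of OmicsDC. It assumes the archive comes from a trusted source, such as your own `OmicsDataCreate` run.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Extract a .tar.gz archive into dest and return the member names.

    Hypothetical helper (not part of OmicsDC); only use it on archives you trust.
    """
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()   # e.g. "mm9/DRX001045_mm9_Histone_H3.3_Muscle_C2C12.bed"
        tar.extractall(dest)     # recreates the mm9/ subdirectory under dest
    return names
```

Calling it with the path of the created archive and `Path("./storage")` as `dest` should mirror the `tar -xvzf ... -C ./storage` cell.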


## Example: Concatenate Files by Antigen for Omics Data

The workflow consists of the following steps:

1. Extract the archive created by `OmicsDC.OmicsDataCreate()`, for example with `tar` as in the cell above.

2. Import the following modules:


In [7]:
```python
from OmicsDC import Matching_Experiments_DF
from OmicsDC.resources import ExperimentList
from OmicsDC import Matching_Experiments_Options
import subprocess
```



3. Retrieve the list of matching experiments using `Matching_Experiments_DF(ExperimentList, Matching_Experiments_Options)`, where `Matching_Experiments_Options` is a dictionary of filter fields accepted by `Matching_Experiments_DF`, and `ExperimentList` is the path to the ExperimentList file.

In [8]:
```python
Matching_Experiments_Options = {
    'id'              : None,
    'Genome assembly' : ['mm9'],
    'Antigen class'   : ['Histone'],
    'Antigen'         : ["H3.3"],
    'Cell type class' : ["Muscle"],
    'Cell type'       : ["C2C12"],
}

Matching_Experiments = Matching_Experiments_DF(ExperimentList, Matching_Experiments_Options)
```
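
The exact matching rules of `Matching_Experiments_DF` are not documented here. The sketch below only illustrates one plausible reading of the options dictionary, under the assumption that a `None` value means "no restriction" and a list means "any of the listed values". It uses plain dictionaries instead of the real ExperimentList; the `matches` helper and the second record are made up for illustration.

```python
def matches(record: dict, options: dict) -> bool:
    """Return True if the record satisfies every option that is not None.

    Assumed semantics (not confirmed by OmicsDC): None = no restriction,
    list = the record's value must be one of the listed values.
    """
    return all(
        allowed is None or record.get(field) in allowed
        for field, allowed in options.items()
    )


# Illustrative records; field names mirror the options dictionary.
records = [
    {"id": "DRX001045", "Genome assembly": "mm9", "Antigen class": "Histone",
     "Antigen": "H3.3", "Cell type class": "Muscle", "Cell type": "C2C12"},
    {"id": "EXAMPLE0002", "Genome assembly": "hg38", "Antigen class": "Histone",
     "Antigen": "H3.3", "Cell type class": "Muscle", "Cell type": "C2C12"},  # made-up record
]

options = {
    "id": None,
    "Genome assembly": ["mm9"],
    "Antigen class": ["Histone"],
    "Antigen": ["H3.3"],
    "Cell type class": ["Muscle"],
    "Cell type": ["C2C12"],
}

# Keep only the IDs of records that pass every active filter.
selected = [r["id"] for r in records if matches(r, options)]
```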



4. Concatenate the needed files using the `cat` command. Iterate over the unique values of `Matching_Experiments['Antigen']` and run the following command for each value:

In [9]:
```python
for i in Matching_Experiments['Antigen'].unique():
    # subprocess.run waits for each cat to finish and raises if it fails
    subprocess.run(f"cat ./storage/mm9/*_{i}_* > ./storage/{i}.bed", shell=True, check=True)
```

Each command concatenates all files matching the pattern `./storage/mm9/*_{i}_*` and redirects the output to `./storage/{i}.bed`. The result is one concatenated BED file per unique value of the `'Antigen'` field in the matching experiments.
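
Where a shell is not available, the same concatenation can be sketched in pure Python. This is an illustrative alternative, not part of OmicsDC: the function name `concat_by_antigen` is hypothetical, and unlike the shell version it skips antigens with no matching files instead of creating an empty output file.

```python
import glob
import shutil
from pathlib import Path


def concat_by_antigen(src_dir: Path, out_dir: Path, antigens: list[str]) -> dict[str, Path]:
    """Concatenate src_dir/*_{antigen}_* into out_dir/{antigen}.bed for each antigen."""
    created = {}
    for antigen in antigens:
        # Same pattern as the cat command; escape glob metacharacters in the name
        # and sort the matches so the output order is stable.
        parts = sorted(src_dir.glob(f"*_{glob.escape(antigen)}_*"))
        if not parts:
            continue  # nothing matched, so no output file is written
        target = out_dir / f"{antigen}.bed"
        with target.open("wb") as out:
            for part in parts:
                with part.open("rb") as src:
                    shutil.copyfileobj(src, out)  # append this file's bytes
        created[antigen] = target
    return created
```

With the layout used above, `src_dir` would be `Path("./storage/mm9")`, `out_dir` would be `Path("./storage")`, and `antigens` the unique values of `Matching_Experiments['Antigen']`.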

In [10]:
```python
!chmod -R a+rw .
```