## OmicPythonApi Tutorial

Contact: Wilson Chen, Omicsoft, 2018-3-21

### Introduction
OmicPythonApi is a Python package that lets users run oscripts via Oshell from within Python. The input to an OmicPythonApi object is normally identical to that of the corresponding oscript, and its output depends entirely on the oscript behind it.

Users may retrieve the oscript of an OmicPythonApi object from its Oscript field, and may refer to the Omicsoft Wiki for the input parameters and output of the corresponding oscript.

To perform land queries, users may use the APIs in this package to download query results to text files, then use other Python packages (Pandas etc.) to load the text files and perform downstream analysis.

OmicPythonApi requires Oshell and Python 3.6.

OmicPythonApi was tested on Windows 10 and CentOS 7.

### Instructions
1. Unzip OmicPythonApi.zip to a desired location (called InstallationFolder). Please do NOT rename the unzipped OmicPythonApi folder under InstallationFolder.
2. Update the mono and oshell paths in the OmicPythonApi.cfg file. If Oshell is installed on Windows, either comment out the MonoPath line with //, or set it to a space character.
3. Include the following lines in your Python code to import the OmicPythonApi package:
```
InstallationFolder=r'Z:\test'
import site
import os
site.addsitedir(InstallationFolder)
import OmicPythonApi as Oapi
```
4. Use Oapi as a regular Python package in your Python script.
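
Before calling `site.addsitedir`, it can help to confirm that the folder layout matches step 1. The sketch below is a hypothetical helper (`check_installation` is not part of OmicPythonApi); it assumes only that the unzipped `OmicPythonApi` folder sits directly under InstallationFolder.

```python
import os


def check_installation(installation_folder):
    """Return the package folder path, or raise if the layout looks wrong.

    Hypothetical helper: it only checks that a folder named OmicPythonApi
    exists directly under installation_folder (see step 1).
    """
    package_dir = os.path.join(installation_folder, "OmicPythonApi")
    if not os.path.isdir(package_dir):
        raise FileNotFoundError(
            "Expected an 'OmicPythonApi' folder under " + installation_folder
        )
    return package_dir


# Usage (uncomment and point at your own folder):
# import site
# InstallationFolder = r'Z:\test'
# check_installation(InstallationFolder)
# site.addsitedir(InstallationFolder)
# import OmicPythonApi as Oapi
```

A failed check here points to a renamed or misplaced folder rather than a problem inside the package.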

### Example
#### 1. Load OmicPythonApi package and other packages

In [17]:
#Load OmicPythonApi package
InstallationFolder=r'Z:\Users\wilson\document\Script_Library\Python\OmicPythonApi_Development'
import site
site.addsitedir(InstallationFolder)
import OmicPythonApi as Oapi

import os
import pandas as pd

#### 2. Set input parameters that will be used for different apis for this tutorial

In [18]:
ServerAddress=r'tcp://192.168.3.226:9065'
ServerUserId='omicsoft'
ServerUserPsw='omicsoft'
SampleSet='OmicsoftTest_SampleSet_20171128'
ProjectPattern='2016'
OutputFolder='/local_data/tem'

ProjectId='20180314_test1'

FastqFilePath="""/IData/Users/QA_QC_Team/input/StandardDataset/RNA-Seq/Fastq/TestData1.1.fastq.gz
/IData/Users/QA_QC_Team/input/StandardDataset/RNA-Seq/Fastq/TestData1.2.fastq.gz"""

OutputFile=r'Z:\Users\wilson\tem1\20180308.txt'
InputTableFile=r'Z:\Users\wilson\tem1\TestSample.FullMetaData.txt'
OutputFolderAbsPath=r'Z:\Users\wilson\tem1'
SampleSetFilePath=r'Z:\Users\wilson\document\Script_Library\Python\OmicPythonApi_001\OmicPythonApi\examples\testData\SampleNames_TCGA.txt'
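
Every example in this tutorial sets the same three connection fields on the api object. A small helper can reduce that repetition. The sketch below is an assumption about how one might organize the code: `apply_connection` is a hypothetical name, and only the attribute names `ServerAddress`, `ServerUserId` and `ServerUserPsw` come from this tutorial. A `SimpleNamespace` stands in for a real api object so the block runs without OmicPythonApi installed.

```python
from types import SimpleNamespace


def apply_connection(api, address, user_id, password):
    """Copy the shared server settings onto an api object and return it.

    The attribute names are the ones used throughout this tutorial;
    the helper itself is hypothetical.
    """
    api.ServerAddress = address
    api.ServerUserId = user_id
    api.ServerUserPsw = password
    return api


# Stand-in object; in the tutorial you would pass e.g. Oapi.ListLands().
demo = apply_connection(SimpleNamespace(), 'tcp://192.168.3.226:9065',
                        'omicsoft', 'omicsoft')
print(demo.ServerAddress)
```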


#### 3. List all lands in an array server and export results to a text file. The text file can be imported using Pandas.

In [19]:
a1=Oapi.ListLands()
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1=pd.read_csv(os.path.join(OutputFolderAbsPath,'ListLands.txt'),sep='\t',header=None)
df1.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,0
0,Blueprint_B37
1,Blueprint_B38
2,CCLE_B37
3,CCLE_B38
4,CCLE_PatternCNV_B37


#### 4. List samplesets and export all samplesets in a land to a text file. The file can be loaded into Python using pandas.

In [20]:
a1=Oapi.ListSampleSets()
a1.Land='TCGA_B37'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();

df1=pd.read_csv(os.path.join(OutputFolderAbsPath,'ListSampleSets.txt'),sep='\t')
df1.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,SampleID,Tag
0,TCGA_1Sample,Wilson
1,Rnaseq,Wilson
2,UCS,Wilson
3,DLBC,Wilson
4,OmicPythonApi_Test1,Wilson_test_20180314


#### 5. Query data in TCGA_B37 using sample ids and export results to text files. See more details in wiki for oscript TextDumpArrayLandSampleData.

In [21]:
a1=Oapi.TextDumpArrayLandSampleData()
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Land='TCGA_B37'
a1.DataMode='Expression_Ratio'
a1.Sample=r'TCGA-01-0628-11A,TCGA-01-0629-11A'
a1.DownloadFullMetaData='True'
a1.OutputFolder=OutputFolderAbsPath
a1.Output='TestSample'
a1.Run();

os.listdir(a1.OutputFolder);

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.
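
The files written by this query appear to share the `Output` value as a file name prefix (for example `TestSample.FullMetaData.txt`). Assuming that naming pattern holds, a hypothetical helper such as `outputs_with_prefix` can pick out the files of one run from a shared output folder:

```python
import os
import tempfile


def outputs_with_prefix(folder, prefix):
    """Return sorted file names in folder that start with prefix + '.'."""
    return sorted(name for name in os.listdir(folder)
                  if name.startswith(prefix + '.'))


# Demo folder with empty placeholder files named after outputs
# mentioned in this tutorial.
demo_folder = tempfile.mkdtemp()
for name in ('TestSample.FullMetaData.txt', 'ListLands.txt'):
    open(os.path.join(demo_folder, name), 'w').close()

print(outputs_with_prefix(demo_folder, 'TestSample'))
# ['TestSample.FullMetaData.txt']
```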


#### 6. Query TCGA_B37 using sample IDs and gene sets, and export results to text files.

In [22]:
a1=Oapi.TextDumpArrayLandGeneData()
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Land='TCGA_B37'
a1.DataMode='Expression_Ratio'
a1.SampleSet=r'OmicPythonApi_Test1'
a1.GeneSet='erg,mdm2'
a1.DownloadFullMetaData='True'
a1.OutputFolder=OutputFolderAbsPath
a1.Output='TestGeneSet'
a1.Run();

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


#### 7. Download a sampleset from TCGA_B37 and export results to a text file. Load the text file with Pandas.

In [23]:
a1=Oapi.DownloadSampleSet()
a1.Land='TCGA_B37'
a1.SampleSet='OmicPythonApi_Test1'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run()
df1=pd.read_csv(os.path.join(a1.OutputFolder,'DownloadSampleSet.txt'),sep='\t',header=None)
df1.head().iloc[:,0]

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


0    TCGA-3L-AA1B-01A
1    TCGA-4N-A93T-01A
2    TCGA-4T-AA8H-01A
3    TCGA-5M-AAT4-01A
4    TCGA-5M-AAT6-01A
Name: 0, dtype: object
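
The downloaded sample IDs can be turned back into the comma-separated form that the `Sample` field takes in example 5. This is a minimal sketch; `to_sample_argument` is a hypothetical helper, and it assumes `Sample` accepts any number of comma-separated IDs, which the tutorial only demonstrates for two.

```python
def to_sample_argument(sample_ids):
    """Join sample IDs into one comma-separated string.

    Blank entries and duplicates are dropped; first-seen order is kept.
    """
    seen = []
    for sample_id in sample_ids:
        sample_id = sample_id.strip()
        if sample_id and sample_id not in seen:
            seen.append(sample_id)
    return ','.join(seen)


ids = ['TCGA-3L-AA1B-01A', 'TCGA-4N-A93T-01A', 'TCGA-3L-AA1B-01A', '']
print(to_sample_argument(ids))  # TCGA-3L-AA1B-01A,TCGA-4N-A93T-01A
```

With Pandas, the first column of the loaded file (`df1.iloc[:,0]`) could be passed in place of `ids`.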

#### 8. Download a geneset from TCGA_B37 and export results to a text file. Load the text file with Pandas.

In [24]:
a1=Oapi.DownloadGeneSet()
a1.Land='TCGA_B37'
a1.GeneSet='LandRApiTest'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1=pd.read_csv(os.path.join(a1.OutputFolder,'DownloadGeneSet.txt'),sep='\t',header=None)
df1.head().iloc[:,0]

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


0    fgf12
1     mdm2
2     braf
3     egfr
Name: 0, dtype: object

#### 9. Download metadata in TCGA_B37 and export results to a text file. Load the text file with Pandas

In [25]:
a1=Oapi.DownloadMetaData()
a1.Land='TCGA_B37'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1=pd.read_csv(os.path.join(a1.OutputFolder,'DownloadMetaData.txt'),sep='\t',header=0)
df1.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,ID,Age At Initial Pathologic Diagnosis,BamFileName,Bcr Patient Uuid,Bcr Sample Uuid,Clinical M,Clinical N,Clinical Stage,Clinical T,Disease,...,Study Source,SubjectID,Survival Days,Survival Status,TissueCategory,Tissue,Tumor Necrosis Percent,Tumor Nuclei Percent,Tumor Or Normal,Tumor Type
0,TCGA-01-0628-11A,.,,,2a91428d-7099-4635-b059-9e5bad09b68a,,,,,Ovarian serous cystadenocarcinoma,...,TCGA,TCGA-01-0628,.,,female reproductive system,ovary,,,Normal,OV
1,TCGA-01-0630-11A,.,,,16727da1-acde-4e5d-898e-18497b35507e,,,,,Ovarian serous cystadenocarcinoma,...,TCGA,TCGA-01-0630,.,,female reproductive system,ovary,,,Normal,OV
2,TCGA-01-0631-11A,.,,,6848450a-7e7d-4577-bf2d-c802ccc7d82c,,,,,Ovarian serous cystadenocarcinoma,...,TCGA,TCGA-01-0631,.,,female reproductive system,ovary,,,Normal,OV
3,TCGA-01-0633-11A,.,,,0845ecac-bcf5-4c1e-a467-e7ba71d52118,,,,,Ovarian serous cystadenocarcinoma,...,TCGA,TCGA-01-0633,.,,female reproductive system,ovary,,,Normal,OV
4,TCGA-01-0636-11A,.,,,2e354f69-d728-4aba-8f54-7987a8566da8,,,,,Ovarian serous cystadenocarcinoma,...,TCGA,TCGA-01-0636,.,,female reproductive system,ovary,,,Normal,OV


#### 10. List data in TCGA_B37 and export results to a text file. Load the text file with Pandas

In [26]:
a1=Oapi.ListDataAvailability()
a1.Land='TCGA_B37'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1=pd.read_csv(os.path.join(a1.OutputFolder,'ListDataAvailability.txt'),sep='\t',header=0)
df1.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,ID,CNV,CNVCall,DNA-Seq Somatic Mutation,Expression Ratio,MS,Methylation450 B37,MiRNA-Seq Normalized Count,RPPA,RPPA RBN,RNA-Seq Exon,RNA-Seq Exon Junction,RNA-Seq Fusion,RNA-Seq Gene Details,RNA-Seq Paired End Fusion,RNA-Seq Somatic Mutation,RNA-Seq Gene Quantification
0,ACC All,180,90,91,0,0,80,80,46,46,79,79,79,79,79,0,79
1,ACC Control,90,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,BLCA All,808,408,396,0,0,437,432,344,127,430,430,430,430,430,20,430
3,BLCA Control,395,0,0,0,0,21,19,0,0,19,19,19,19,19,0,19
4,BRCA All,2217,1080,1002,593,102,893,1202,937,820,1227,1227,1227,1227,1227,120,1227
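
The availability table makes it easy to find which cohorts have any samples for a given data type. The sketch below uses the standard library and a few rows copied from the output above (first five columns only); `cohorts_with_data` is a hypothetical helper, and it assumes the file is tab-delimited with a header row and integer counts.

```python
import csv
import io

# Rows copied from the ListDataAvailability output shown above.
SAMPLE = (
    "ID\tCNV\tCNVCall\tDNA-Seq Somatic Mutation\tExpression Ratio\n"
    "ACC All\t180\t90\t91\t0\n"
    "BLCA All\t808\t408\t396\t0\n"
    "BRCA All\t2217\t1080\t1002\t593\n"
)


def cohorts_with_data(handle, data_type):
    """Return {cohort ID: sample count} for cohorts with a positive count."""
    result = {}
    for row in csv.DictReader(handle, delimiter='\t'):
        count = int(row[data_type])
        if count > 0:
            result[row['ID']] = count
    return result


print(cohorts_with_data(io.StringIO(SAMPLE), 'Expression Ratio'))
# {'BRCA All': 593}
```

To run it on the real file, pass an open handle to `ListDataAvailability.txt` instead of the `io.StringIO` demo.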


#### 11. Download gene comparison data from HumanDisease_B37 for a given comparison set. Load the resulting text files using Pandas.

In [27]:
a1=Oapi.DownloadArrayLandComparisonData()
a1.Land='HumanDisease_B37'
a1.ComparisonSet=r'GSE13887.GPL570.test1,GSE13849.GPL570.test1,GSE13849.GPL570.test2'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1_CpGenes=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.Comparison.Genes.txt'),sep='\t',header=0)
df1_CpGenes_Design=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.Comparison.Genes.txt'),sep='\t',header=0)
df1_Cp=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.Comparison.txt'),sep='\t',header=0)
df1_Meta=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.FullMetaData.txt'),sep='\t',header=0,low_memory=False)
df1_CpGenes.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,ID,GSE13887.GPL570.test1,GSE13849.GPL570.test1,GSE13849.GPL570.test2
0,DDX11L1,1.0935,0.1685,0.2511
1,WASH7P,0.2277,-0.2983,-0.2595
2,MI0022705,.,.,.
3,MI0006363,.,.,.
4,FAM138F,0.3401,0.0599,0.0650
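
In this output a lone `.` appears where no value is available (see MI0022705 and MI0006363), so those cells need handling before any numeric work. The following is a minimal standard-library sketch; `to_float` and `load_measured_genes` are hypothetical helpers, and the rows are copied from the output above. With Pandas, passing `na_values='.'` to `pd.read_csv` should have a similar effect.

```python
import csv
import io

# Rows copied from the Comparison.Genes output shown above.
SAMPLE = (
    "ID\tGSE13887.GPL570.test1\tGSE13849.GPL570.test1\tGSE13849.GPL570.test2\n"
    "DDX11L1\t1.0935\t0.1685\t0.2511\n"
    "MI0022705\t.\t.\t.\n"
    "FAM138F\t0.3401\t0.0599\t0.0650\n"
)


def to_float(text):
    """Convert a cell to float, treating '.' and empty cells as missing."""
    return None if text.strip() in ('', '.') else float(text)


def load_measured_genes(handle):
    """Return {gene: {comparison: value}} for genes with at least one value."""
    reader = csv.reader(handle, delimiter='\t')
    header = next(reader)
    genes = {}
    for row in reader:
        values = [to_float(cell) for cell in row[1:]]
        if any(value is not None for value in values):
            genes[row[0]] = dict(zip(header[1:], values))
    return genes


genes = load_measured_genes(io.StringIO(SAMPLE))
print(sorted(genes))  # ['DDX11L1', 'FAM138F']
```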


#### 12. Download gene comparison data from HumanDisease_B37 for a given comparison set and GeneSet. Load the resulting text files using Pandas.

In [28]:
a1=Oapi.TextDumpArrayLandGeneComparison()
a1.Land='HumanDisease_B37'
a1.GeneSet='MET,egfr,braf,Kras'
a1.ComparisonSet=r'GSE13887.GPL570.test1,GSE13849.GPL570.test1,GSE13849.GPL570.test2'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1_Cp=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.Comparison.txt'),sep='\t',header=0)
df1_CpMx=pd.read_csv(os.path.join(a1.OutputFolder,'Comparison.matrix.txt'),sep='\t',header=0)
df1_Meta=pd.read_csv(os.path.join(a1.OutputFolder,a1.Output+'.FullMetaData.txt'),sep='\t',header=0,low_memory=False)

df1_Cp.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,ID,ComparisonID,GeneID,GeneName,EntrezID,Probe,Log2FoldChange,RawPValue,AdjustedPValue,NumeratorValue,DenominatorValue
0,1,GSE13887.GPL570.test1,MET,MET,4233,213816_s_at,0.248175,0.341515,0.704033,6.180288,5.932113
1,2,GSE13887.GPL570.test1,MET,MET,4233,211599_x_at,0.075648,0.58657,0.839611,8.631186,8.555538
2,3,GSE13887.GPL570.test1,MET,MET,4233,203510_at,-0.290123,0.703461,0.892646,7.263002,7.553125
3,4,GSE13887.GPL570.test1,MET,MET,4233,213807_x_at,0.027645,0.904087,0.967445,7.073084,7.04544
4,5,GSE13849.GPL570.test1,MET,MET,4233,203510_at,-0.189345,0.085426,0.28879,5.629444,5.818789
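
Because a gene can be measured by several probes within one comparison, a typical follow-up is to keep one probe per gene and comparison. The sketch below picks the probe with the smallest `RawPValue`; that selection rule is an illustrative assumption, not something the oscript prescribes, and `best_probes` is a hypothetical helper. The rows are a subset of the columns shown above.

```python
import csv
import io

# Rows copied from the Comparison output shown above (subset of columns).
SAMPLE = (
    "ComparisonID\tGeneName\tProbe\tLog2FoldChange\tRawPValue\n"
    "GSE13887.GPL570.test1\tMET\t213816_s_at\t0.248175\t0.341515\n"
    "GSE13887.GPL570.test1\tMET\t211599_x_at\t0.075648\t0.58657\n"
    "GSE13887.GPL570.test1\tMET\t203510_at\t-0.290123\t0.703461\n"
    "GSE13849.GPL570.test1\tMET\t203510_at\t-0.189345\t0.085426\n"
)


def best_probes(handle):
    """Return {(ComparisonID, GeneName): Probe} with the lowest RawPValue."""
    best = {}
    for row in csv.DictReader(handle, delimiter='\t'):
        key = (row['ComparisonID'], row['GeneName'])
        p_value = float(row['RawPValue'])
        if key not in best or p_value < best[key][0]:
            best[key] = (p_value, row['Probe'])
    return {key: probe for key, (_, probe) in best.items()}


for key, probe in best_probes(io.StringIO(SAMPLE)).items():
    print(key, probe)
```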


#### 13. Download a comparison set from HumanDisease_B37 and save results in a text file. Load the text file with Pandas.

In [29]:
#Run the DownloadComparisonSet api, which produces a text file based on query results in 'HumanDisease_B37'.
a1=Oapi.DownloadComparisonSet()
a1.Land='HumanDisease_B37'
a1.ComparisonSet=r'WilsonComparisonSetTest'
a1.OutputFolder=OutputFolderAbsPath
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.Run();
df1_CpSet=pd.read_csv(os.path.join(a1.OutputFolder,'DownloadComparisonSet.txt'),sep='\t',header=0)
df1_CpSet.head().iloc[:,0]

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


0    GSE13849.GPL570.test1
1    GSE13849.GPL570.test2
Name: ComparisonID, dtype: object

#### 14. Fetch sampleset stored in ArrayServer, and export results to a text file. Load the text file with Pandas

In [30]:
a1=Oapi.FetchSampleSetMetaData()
a1.ServerAddress=ServerAddress
a1.ServerUserId=ServerUserId
a1.ServerUserPsw=ServerUserPsw
a1.SampleSetId=SampleSet
a1.OutputFile=OutputFile
a1.Run()
df1=pd.read_csv(a1.OutputFile,sep='\t',header=1)
df1.head()

Load configuration...
Load Oscript template
Make oscript
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


Unnamed: 0,SampleSetID,ContactEmail,ContactName,Description,Keywords,LoadedBy,LoadedDate,Organism,PlatformType,Title
0,OmicsoftTest_SampleSet_20171128,support@omicsoft.com,Omicsoft,test,Test,Omicsoft,11/28/2017,Rat,Expression Array,OmicsoftTest SampleSet 20171128


#### 15. Run a custom oscript

In [31]:
a1=Oapi.OmicPythonApi()
a1.Oscript=r"""Begin ExecuteCommand /Namespace=Server;
Server "tcp://192.168.3.226:9065" /UserID=chenx57 /Password=omicsoft;
Command IA_FetchSampleIDs;
Options
"
SampleSetID=OmicsoftTest_SampleSet_20171128
";
OutputFile "Z:\Users\wilson\support\oscript\output\test_20180321.txt";
End;"""
a1.Run()

Load configuration...
Lauching Oshell and Running Oscript
Oshell run completed, no error message.


'OShell version=10.0.1.50\r\nProgram started at Wednesday, March 21, 2018 1:16:25 PM\r\nPerform initialization...\r\n\r\n[00:00:00] Windows OS detected...Initialization done...\r\nTempFolder=D:\\local\\data\\OmicsoftFiles\\Temp\\2018.3.21.c773c2d568a98215\r\nScript used: \r\nBegin ExecuteCommand /Namespace=Server;\nServer "tcp://192.168.3.226:9065" /UserID=chenx57 /Password=omicsoft;\nCommand IA_FetchSampleIDs;\nOptions\n"\nSampleSetID=OmicsoftTest_SampleSet_20171128\n";\nOutputFile "Z:\\Users\\wilson\\support\\oscript\\output\\test_20180321.txt";\nEnd;\r\n\r\nExecuting scripts...\r\n\r\n[00:00:00] ----------Started ProcExecuteCommand----------\r\n\r\n[00:00:00] Fetching user profile...\r\n[00:00:00] Uploading input data....\r\n[00:00:00] Executing command on the server...\r\n[00:00:01] Downloading output data...\r\n[00:00:01] ----------Finished ProcExecuteCommand----------\r\nProgram finished at Wednesday, March 21, 2018 1:16:26 PM\r\nTemp folder: D:\\local\\data\\OmicsoftFiles\\Temp\

#### 16. Show the log of the oscript job by printing obj.Log

In [32]:
print(a1.Log)

OShell version=10.0.1.50
Program started at Wednesday, March 21, 2018 1:16:25 PM
Perform initialization...

[00:00:00] Windows OS detected...Initialization done...
TempFolder=D:\local\data\OmicsoftFiles\Temp\2018.3.21.c773c2d568a98215
Script used: 
Begin ExecuteCommand /Namespace=Server;
Server "tcp://192.168.3.226:9065" /UserID=chenx57 /Password=omicsoft;
Command IA_FetchSampleIDs;
Options
"
SampleSetID=OmicsoftTest_SampleSet_20171128
";
OutputFile "Z:\Users\wilson\support\oscript\output\test_20180321.txt";
End;

Executing scripts...

[00:00:00] ----------Started ProcExecuteCommand----------

[00:00:00] Fetching user profile...
[00:00:00] Uploading input data....
[00:00:00] Executing command on the server...
[00:00:01] Downloading output data...
[00:00:01] ----------Finished ProcExecuteCommand----------
Program finished at Wednesday, March 21, 2018 1:16:26 PM
Temp folder: D:\local\data\OmicsoftFiles\Temp\2018.3.21.c773c2d568a98215 deleted at machine: DESKTOP-P81RJ85
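
Since the log is plain text, simple facts such as the run time can be extracted from it. The sketch below parses the `Program started at` and `Program finished at` lines seen above; `run_seconds` is a hypothetical helper, and the timestamp format is inferred from this one log, so it may differ on other machines or locales.

```python
import re
from datetime import datetime

# A few lines copied from the log shown above.
LOG = (
    "OShell version=10.0.1.50\n"
    "Program started at Wednesday, March 21, 2018 1:16:25 PM\n"
    "[00:00:01] Downloading output data...\n"
    "Program finished at Wednesday, March 21, 2018 1:16:26 PM\n"
)

# Assumed format, inferred from the log above (English day and month names).
STAMP_FORMAT = '%A, %B %d, %Y %I:%M:%S %p'


def run_seconds(log_text):
    """Return elapsed seconds between the start and finish lines, or None."""
    started = re.search(r'^Program started at (.+?)\s*$', log_text, re.MULTILINE)
    finished = re.search(r'^Program finished at (.+?)\s*$', log_text, re.MULTILINE)
    if not (started and finished):
        return None
    begin = datetime.strptime(started.group(1), STAMP_FORMAT)
    end = datetime.strptime(finished.group(1), STAMP_FORMAT)
    return (end - begin).total_seconds()


print(run_seconds(LOG))  # 1.0
```

In the tutorial, `a1.Log` would be passed in place of the `LOG` demo string.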