# Project notebook

- Easy access to all the experimental data generated in the lab
- All the results from statistical analysis
- Visualization of reports
- All the Python functionality at hand

## Library import

In [4]:
from report_manager import project, analysisResult
from plotly.offline import init_notebook_mode, iplot
import missingno as msno
import warnings

warnings.filterwarnings('ignore')
init_notebook_mode(connected=True)
%matplotlib inline
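
Silencing every warning globally, as the cell above does, can also hide deprecation notices that matter. The following is a minimal sketch of a narrower alternative using only the standard library's `warnings.catch_warnings`. The helpers `noisy_step` and `run_quietly` are hypothetical and stand in for any call that emits warnings.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("this API is deprecated", DeprecationWarning)
    return 42


def run_quietly(func):
    """Call func with warnings suppressed only for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func()


result = run_quietly(noisy_step)

# Outside the context manager the previous filters are restored,
# so the same warning can still be observed and recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy_step()
```

Scoping the filter this way keeps noisy steps quiet without hiding warnings raised elsewhere in the notebook.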

## Creating a Project object

- Connects to the database
- Extracts all the project information for each data type: clinical, proteomics, whole-exome sequencing, etc.
- Runs all the default analyses
- Returns all datasets, analysis results, and plots

In [5]:
p = project.Project('P0000001', datasets=None, report={})

Imputing row 1/10 with 27 missing, elapsed time: 0.001
Imputing row 1/10 with 22 missing, elapsed time: 0.001
Imputing row 1/10 with 31 missing, elapsed time: 0.001
Imputing row 1/10 with 27 missing, elapsed time: 0.001
Imputing row 1/8 with 31 missing, elapsed time: 0.000
pickSoftThreshold: will use block size 511.
 pickSoftThreshold: calculating connectivity for given powers...
   ..working on genes 1 through 511 of 511
  Power SFT.R.sq slope truncated.R.sq  mean.k. median.k.  max.k.
1     1    0.456 -3.67         0.9460 71.80000  6.99e+01 111.000
2     2    0.714 -3.19         0.9830 15.90000  1.49e+01  36.400
3     3    0.850 -2.71         0.9910  4.54000  3.97e+00  14.800
4     4    0.900 -2.45         0.9560  1.55000  1.22e+00   6.930
5     5    0.944 -2.19         0.9860  0.60800  4.21e-01   3.670
6     6    0.973 -1.85         0.9710  0.27300  1.57e-01   2.170
7     7    0.315 -3.07         0.1940  0.139
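
The log above comes from a soft-threshold scan: each candidate power is scored by how well the resulting network fits a scale-free topology (`SFT.R.sq`). A common convention is to take the lowest power whose fit reaches a cutoff. The following is a minimal sketch of that selection, using the `Power` and `SFT.R.sq` values printed above. The 0.85 cutoff and the helper name `pick_power` are assumptions for illustration and may not match what `report_manager` does internally.

```python
# (power, SFT.R.sq) pairs copied from the pickSoftThreshold output above.
SCAN = [(1, 0.456), (2, 0.714), (3, 0.850), (4, 0.900),
        (5, 0.944), (6, 0.973), (7, 0.315)]


def pick_power(scan, r2_cutoff=0.85):
    """Return the lowest power whose scale-free fit reaches r2_cutoff.

    Returns None when no power qualifies. The cutoff is an assumed,
    adjustable convention.
    """
    for power, r2 in sorted(scan):
        if r2 >= r2_cutoff:
            return power
    return None


chosen = pick_power(SCAN)
stricter = pick_power(SCAN, r2_cutoff=0.90)
```

Raising the cutoff trades a better scale-free fit for lower mean connectivity, which is visible in the `mean.k.` column.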

## Visualizing the Project report

In [6]:
plots = p.show_report("notebook")

PlotlyError: The `figure_or_data` positional argument must be `dict`-like, `list`-like, or an instance of plotly.graph_objs.Figure
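
The error above is raised when something other than a figure reaches Plotly's offline plotting call. The following is a minimal sketch of a pre-check that separates plottable objects from the rest before display. It uses only the standard library and approximates the rule quoted in the message. The helper names `looks_like_figure` and `split_plottable`, and the duck-typed `to_plotly_json` test for `Figure` instances, are assumptions; the structure returned by `show_report` is not known here.

```python
from collections.abc import Mapping, Sequence


def looks_like_figure(obj):
    """Approximate Plotly's rule: dict-like, list-like, or a Figure.

    Figure instances are detected by duck typing (an assumption) so the
    sketch runs without importing plotly.
    """
    if isinstance(obj, (str, bytes)):
        return False  # strings are sequences but not figure data
    if isinstance(obj, (Mapping, Sequence)):
        return True
    return callable(getattr(obj, "to_plotly_json", None))


def split_plottable(items):
    """Partition items into (plottable, rejected) lists."""
    plottable = [x for x in items if looks_like_figure(x)]
    rejected = [x for x in items if not looks_like_figure(x)]
    return plottable, rejected


candidates = [{"data": [], "layout": {}}, None, "not a figure", [{"type": "bar"}]]
ok, bad = split_plottable(candidates)
```

Inspecting the rejected items is usually the quickest way to find which report entry is not a figure.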

## Access to datasets

### Clinical data

In [None]:
clin_dataset = p.get_dataset('clinical').get_dataset('preprocessed')
clin_dataset.head()

#### Further dataset manipulation and visualization

In [None]:
clin_dataset = clin_dataset.drop(['group'], axis=1).pivot_table(index='subject', columns='clinical_variable', values='value', aggfunc='first')
msno.matrix(clin_dataset)
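
The `pivot_table` call reshapes a long table (one row per subject and variable) into a wide matrix (one row per subject), so that unmeasured variables appear as `NaN` cells that `msno.matrix` can display. The following is a minimal sketch on made-up data. The column names mirror the cell above, but the subjects and values are invented for illustration.

```python
import pandas as pd

# Toy long-format table; values are invented for illustration.
long_df = pd.DataFrame({
    "subject": ["S1", "S1", "S2", "S3", "S3"],
    "group": ["A", "A", "A", "B", "B"],
    "clinical_variable": ["age", "bmi", "age", "age", "bmi"],
    "value": [54, 27.1, 61, 47, 30.4],
})

# One row per subject, one column per variable; aggfunc='first' keeps the
# first value when a (subject, variable) pair occurs more than once.
wide_df = long_df.drop(["group"], axis=1).pivot_table(
    index="subject", columns="clinical_variable", values="value", aggfunc="first"
)

# Subject S2 has no 'bmi' measurement, so that cell becomes NaN.
n_missing = int(wide_df.isna().sum().sum())
```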

### Proteomics dataset (original)

In [None]:
dataset = p.get_dataset("proteomics").get_dataset("dataset")

In [None]:
dataset.head()

In [None]:
dataset = dataset.drop(['group'], axis=1).pivot_table(index='sample', columns='identifier', values='LFQ_intensity', aggfunc='first')
msno.matrix(dataset)
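
`msno.matrix` gives a visual impression of missingness; a numeric summary is often useful alongside it. The following is a minimal sketch with pandas on an invented wide matrix (samples in rows, protein identifiers in columns). The identifiers, the values, and the 50% cutoff are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Invented wide matrix: samples in rows, protein identifiers in columns.
wide = pd.DataFrame(
    {
        "P1": [21.3, np.nan, 20.8, 21.0],
        "P2": [np.nan, np.nan, np.nan, 18.2],
        "P3": [25.1, 24.9, 25.4, 25.0],
    },
    index=["sample_1", "sample_2", "sample_3", "sample_4"],
)

# Fraction of missing values per protein (column) and per sample (row).
missing_per_protein = wide.isna().mean(axis=0)
missing_per_sample = wide.isna().mean(axis=1)

# Keep proteins quantified in at least half of the samples (assumed cutoff).
kept = wide.loc[:, missing_per_protein <= 0.5]
```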

In [None]:
preprocessed_dataset = p.get_dataset('clinical').get_dataset('preprocessed')
preprocessed_dataset.head()

### Proteomics dataset (imputed)

In [None]:
reg_dataset = p.get_dataset("proteomics").get_dataset("regulation_table")
reg_dataset.loc[reg_dataset['identifier'] == 'O60341-KDM1A',:]

In [None]:
result = analysisResult.AnalysisResult(
    "Mapper analysis",
    analysis_type="mapper",
    args={"n_cubes": 15,
          "overlap": 0.85,
          "n_clusters": 2,
          "linkage": "single",
          "title": "Topological data analysis - Sample stratification"},
    data=reg_dataset)
mapper_plot = result.get_plot(name="mapper", identifier="mapper_plot")[0]

In [None]:
iplot(mapper_plot.figure)
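
In a Mapper analysis, arguments such as `n_cubes` and `overlap` typically describe a cover of the projected data: the value range is split into `n_cubes` intervals, and neighbouring intervals share a fraction `overlap` of their length. The following is a minimal sketch of one such 1-D cover. The exact definition varies between implementations, so the formula and the helper name `make_cover` are assumptions and may not match what `AnalysisResult` uses.

```python
def make_cover(lo, hi, n_cubes, overlap):
    """Return n_cubes equal-length intervals covering [lo, hi].

    Consecutive intervals share a fraction `overlap` of their length
    (0 <= overlap < 1). This is one possible definition, assumed here.
    """
    if n_cubes < 1 or not 0 <= overlap < 1:
        raise ValueError("need n_cubes >= 1 and 0 <= overlap < 1")
    # length + (n_cubes - 1) * step == hi - lo, with step = length * (1 - overlap)
    length = (hi - lo) / (1 + (n_cubes - 1) * (1 - overlap))
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_cubes)]


# Same n_cubes and overlap as in the cell above, on a unit range.
cover = make_cover(0.0, 1.0, n_cubes=15, overlap=0.85)
```

With a high overlap such as 0.85, most points fall into several intervals, which is what links neighbouring clusters in the resulting graph.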

## Analysis results

### Differential regulation

In [None]:
reg_table = p.get_dataset("proteomics").get_dataset("regulation_table")
reg_table.head()