## Some initial setup steps

This demo has been coded to request the phenotype frequencies of patients in the DPP and Euro-NMD registries. We first need to do some "housekeeping" so that our environment can make requests over the web and plot the results.

In [None]:
require 'daru/view'
require 'rest-client'

Daru::View.plotting_library = :googlecharts

puts  "thanks!  Go to the next box now :-)"

## Call the interface

All of the private components are constantly running on the DPP and Euro-NMD servers, so we do not need to do anything in that regard.

All we need to do is call the URL of the Secure Shell proxy for each registry.

In [None]:
phenocsv = RestClient.get('http://fairdata.services:8088/api-local/phenotype-frequencies')
enmdcsv = RestClient.get('https://zks-docker.ukl.uni-freiburg.de/grlc-euronmd/api-local/phenotype-frequencies')
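dpp_phenotype_hash = {}
enmd_phenotype_hash = {}

# Split each response body on whitespace, skip the first two tokens,
# and store phenotype => count for every remaining "phenotype,count" row
phenocsv.body.split[2..].each do |row|
    pheno, count = row.split(',')
    dpp_phenotype_hash[pheno] = count
end
enmdcsv.body.split[2..].each do |row|
    pheno, count = row.split(',')
    enmd_phenotype_hash[pheno] = count
end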

# Print the hash for DPP
puts "DPP phenotype count"
print dpp_phenotype_hash
puts 

# Print the hash for EURO-NMD
puts
puts "EURO-NMD phenotype count"
print enmd_phenotype_hash
puts
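
The parsing step above is repeated once per registry. As a minimal sketch, the same logic could be wrapped in a small helper. The name `parse_frequencies`, its `skip` parameter, and the sample body are assumptions made for illustration; the real response format of the services is not shown in this notebook. Unlike the cell above, this sketch converts the counts to integers straight away.

```ruby
# Hypothetical helper (not part of the original notebook): turns a response
# body into { phenotype => count }. It mirrors the cell above: split on
# whitespace, skip the first `skip` tokens, then split each token on the comma.
def parse_frequencies(body, skip: 2)
  body.split.drop(skip).each_with_object({}) do |row, hash|
    pheno, count = row.split(',')
    hash[pheno] = count.to_i
  end
end

# Made-up sample body, NOT real registry data. The first line is assumed to
# occupy two whitespace-separated tokens, matching the skip of 2 above.
sample_body = "phenotype, count\n" \
              "http://purl.obolibrary.org/obo/HP_0030193,12\n" \
              "http://purl.obolibrary.org/obo/HP_0008366,5\n"

sample_hash = parse_frequencies(sample_body)
puts sample_hash
```

With a helper like this, each registry would need only one call on its response body.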


## Calculate the total number of phenotypes in each registry
This gives an idea of how similar DPP and Euro-NMD are in the total number of phenotypes recorded across all patients.

In [None]:
# Calculate the total of the phenotype frequencies in DPP
dpp_total_phenotypes = dpp_phenotype_hash.values.map(&:to_i).sum
puts "DPP total amount of phenotypes: #{dpp_total_phenotypes}"

# Calculate the total of the phenotype frequencies in EURO-NMD
enmd_total_phenotypes = enmd_phenotype_hash.values.map(&:to_i).sum
puts "EURO-NMD total amount of phenotypes: #{enmd_total_phenotypes}"

## Find the common phenotypes for both registries
Next, we will compare the phenotypes themselves to check which of them are present in both DPP and Euro-NMD.

In [None]:
# Keys present in both hashes (array intersection)
common_phenotypes = dpp_phenotype_hash.keys & enmd_phenotype_hash.keys
puts "Common phenotypes"
puts common_phenotypes
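
The `&` operator used above is Ruby's array intersection. As a small illustration, the sketch below applies it together with array difference (`-`) to show which phenotypes are shared and which appear in only one registry. The key lists are made up for the example and are not real registry contents.

```ruby
# Made-up phenotype keys for illustration only.
dpp_keys  = ["HP_0030193", "HP_0008366", "HP_0001250"]
enmd_keys = ["HP_0008366", "HP_0030193", "HP_0002650"]

shared    = dpp_keys & enmd_keys   # present in both lists
only_dpp  = dpp_keys - enmd_keys   # present only in the first list
only_enmd = enmd_keys - dpp_keys   # present only in the second list

puts "Shared:    #{shared.inspect}"
puts "Only DPP:  #{only_dpp.inspect}"
puts "Only ENMD: #{only_enmd.inspect}"
```

The intersection keeps the order of the left-hand array, so the shared list follows the DPP ordering.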

## Show the phenotype frequencies for the shared phenotypes, as well as their relative frequencies
Since the two registries differ considerably in the number of patients whose phenotype information is stored in the database, we will calculate the relative frequencies to get a better comparison between them.

In [None]:
dpp_common_freqs_hash = Hash.new
enmd_common_freqs_hash = Hash.new
dpp_rel_freqs_hash = Hash.new
enmd_rel_freqs_hash = Hash.new

# Print the common phenotypes and their frequencies
puts "DPP common phenotypes"
common_phenotypes.each do |pheno|
    freq = dpp_phenotype_hash[pheno].to_i
    rel_freq = (freq.to_f/dpp_total_phenotypes.to_f).round(3)
    puts "Phenotype: #{pheno};  Frequency: #{freq}; Relative frequency: #{rel_freq}"
    dpp_common_freqs_hash[pheno] = freq
    dpp_rel_freqs_hash[pheno] = rel_freq
end

puts "EURO-NMD common phenotypes"
common_phenotypes.each do |pheno|
    freq = enmd_phenotype_hash[pheno].to_i
    rel_freq = (freq.to_f/enmd_total_phenotypes.to_f).round(3)
    puts "Phenotype: #{pheno};  Frequency: #{freq}; Relative frequency: #{rel_freq}"
    enmd_common_freqs_hash[pheno] = freq
    enmd_rel_freqs_hash[pheno] = rel_freq
end
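
The relative frequency computed above is each phenotype's count divided by the registry's total count, rounded to three decimals. A minimal sketch of that calculation as a reusable helper follows. The name `relative_frequencies`, the `digits` parameter, the guard against a zero total, and the sample numbers are all assumptions added for illustration.

```ruby
# Hypothetical helper (not part of the original notebook): maps each count
# to count / total, rounded to `digits` decimals.
def relative_frequencies(counts, total, digits: 3)
  return {} if total.zero? # avoid dividing by zero for an empty registry

  counts.transform_values { |count| (count.to_f / total).round(digits) }
end

# Made-up numbers, NOT real registry data. As in the cell above, the total
# is assumed to cover all phenotypes in the registry, not only the shared ones.
sample_counts = { "HP_0030193" => 12, "HP_0008366" => 5 }
sample_total  = 40

sample_rel = relative_frequencies(sample_counts, sample_total)
puts sample_rel
```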

## Analytics
Here is a simple plot of the frequencies of the shared phenotypes.

In [None]:
data_rows = [
  ['DPP HP:0030193', dpp_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0030193"]],
  ['ENMD HP:0030193', enmd_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0030193"]],
  ['DPP HP:0008366', dpp_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0008366"]],
  ["ENMD HP:0008366", enmd_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0008366"]],
  ['DPP HP:0002650', dpp_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0002650"]],
  ["ENMD HP:0002650", enmd_common_freqs_hash["http://purl.obolibrary.org/obo/HP_0002650"]],
]
index = Daru::Index.new ['Phenotype', 'Number of people with the phenotype']
frame = Daru::DataFrame.rows(data_rows)
frame.vectors = index
table = Daru::View::Table.new(frame)

options = {
  title: 'Phenotype frequencies',
  type: :bar,
  height: 500
}
chart = Daru::View::Plot.new(table.table, options)
chart.show_in_iruby

## Analytics 2
Now, let's compare the relative frequencies of those same phenotypes.

In [None]:
data_rows = [
  ['DPP HP:0030193', dpp_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0030193"]],
  ['ENMD HP:0030193', enmd_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0030193"]],
  ['DPP HP:0008366', dpp_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0008366"]],
  ["ENMD HP:0008366", enmd_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0008366"]],
  ['DPP HP:0002650', dpp_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0002650"]],
  ["ENMD HP:0002650", enmd_rel_freqs_hash["http://purl.obolibrary.org/obo/HP_0002650"]],
]
index = Daru::Index.new ['Phenotype', 'Relative phenotype frequency']
frame = Daru::DataFrame.rows(data_rows)
frame.vectors = index
table = Daru::View::Table.new(frame)

options = {
  title: 'Relative phenotype frequencies',
  type: :bar,
  height: 500
}
chart = Daru::View::Plot.new(table.table, options)
chart.show_in_iruby