## Some initial setup steps

### Step 1:  Select the Ruby kernel
If you see the word "Ruby" in the top right of your screen, go directly to Step 2 :-)

If you do not see the word "Ruby" in the top right of your screen, you need to set the Ruby kernel for this demo. In the menu bar at the top of this page, click "Kernel" --> "Change Kernel" --> "Ruby 3.x.x".


### Step 2:  Set up the analytics environment

This demo has been coded to request the number of Duchenne and Becker patients in the DPP.  We first need to do some "housekeeping" so that our environment can make requests over the web and plot the results...


In [None]:
require 'daru/view'
require 'rest-client'

Daru::View.plotting_library = :googlecharts

puts  "thanks!  Go to the next box now :-)"

## Call the interface

All we need to do is call the URL of the "phenotype frequencies" service for each registry.  The output of this box is intended only to demonstrate that no sensitive information is output from these services.


In [None]:
# Fetch the CSV output of the "phenotype frequencies" service from each registry
phenocsv = RestClient.get('https://www.fairdata.services/proxy/grlc/phenotype-frequencies/phenotype-frequencies')
enmdcsv = RestClient.get('https://zks-docker.ukl.uni-freiburg.de/grlc-euronmd/api-local/phenotype-frequencies')
dpp_phenotype_hash = {}
enmd_phenotype_hash = {}

# Skip the first two whitespace-separated tokens of each response,
# then map each phenotype (first column) to its count (second column)
phenocsv.body.split[2..].each do |tmp|
    phenotype, count = tmp.split(',')
    dpp_phenotype_hash[phenotype] = count
end
enmdcsv.body.split[2..].each do |tmp|
    phenotype, count = tmp.split(',')
    enmd_phenotype_hash[phenotype] = count
end

# Print the hash for DPP
puts "DPP phenotype count"
print dpp_phenotype_hash
puts 

# Print the hash for EURO-NMD
puts
puts "EURO-NMD phenotype count"
print enmd_phenotype_hash
puts
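
The cell above splits each response on whitespace, which works as long as no field contains a space. As a hedged alternative, here is a minimal sketch that splits line by line instead. The helper name `parse_counts`, the `header_lines` parameter, and the sample data are all hypothetical; how many header lines the real services return is an assumption you would need to check.

```ruby
# Hypothetical helper: turn a two-column CSV body (phenotype,count) into a hash.
# header_lines is an assumption about the service output, not a documented value.
def parse_counts(csv_text, header_lines: 1)
  csv_text.lines
          .map(&:strip)
          .reject(&:empty?)
          .drop(header_lines)
          .to_h do |line|
            phenotype, count = line.split(',', 2)
            [phenotype, count.to_i]
          end
end

# Made-up sample data, for illustration only
sample = "phenotype,count\nhttp://example.org/pheno/P1,12\nhttp://example.org/pheno/P2,3\n"
p parse_counts(sample)
```

Unlike the cell above, this sketch converts the counts to integers while parsing, so later cells would not need `to_i`.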


## Calculate the total phenotype count in each registry

This gives an idea of how similar DPP and EURO-NMD are in the total number of phenotype observations across all patients.


In [None]:
# Sum the counts of all phenotypes in DPP
dpp_total_phenotypes = dpp_phenotype_hash.values.map(&:to_i).sum
puts "DPP total amount of phenotypes: #{dpp_total_phenotypes}"

# Sum the counts of all phenotypes in EURO-NMD
enmd_total_phenotypes = enmd_phenotype_hash.values.map(&:to_i).sum
puts "EURO-NMD total amount of phenotypes: #{enmd_total_phenotypes}"

## Find the common phenotypes for both registries
Next, we will compare the phenotypes themselves, to check which of them are present in both DPP and EURO-NMD.

In [None]:
# Keep only the phenotypes that appear as keys in both hashes
common_phenotypes = dpp_phenotype_hash.keys & enmd_phenotype_hash.keys
puts "Common phenotypes"
puts common_phenotypes
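
The `&` operator on two arrays returns their intersection, and `-` returns the difference. A minimal sketch on made-up data, using a hypothetical helper `compare_keys` that is not part of this demo, shows how the shared phenotypes and the ones unique to each registry could be listed:

```ruby
# Hypothetical helper: compare the keys of two count hashes.
def compare_keys(first, second)
  {
    shared: first.keys & second.keys,      # present in both
    only_first: first.keys - second.keys,  # present only in the first hash
    only_second: second.keys - first.keys  # present only in the second hash
  }
end

# Made-up counts, for illustration only
registry_a = { "P1" => 4, "P2" => 9, "P3" => 1 }
registry_b = { "P2" => 2, "P3" => 7, "P4" => 5 }
p compare_keys(registry_a, registry_b)
```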

## Show the frequencies for the shared phenotypes, as well as their relative frequencies
Because the two registries differ considerably in their overall number of patients, we will calculate the relative frequency of each shared phenotype within its own registry to get a better comparison between them:

In [None]:
dpp_common_freqs_hash = Hash.new
enmd_common_freqs_hash = Hash.new
dpp_rel_freqs_hash = Hash.new
enmd_rel_freqs_hash = Hash.new

# Print the common phenotypes and their frequencies
puts "DPP common phenotypes"
common_phenotypes.each do |pheno|
    freq = dpp_phenotype_hash[pheno].to_i
    rel_freq = (freq.to_f/dpp_total_phenotypes.to_f).round(3)
    puts "Phenotype: #{pheno};  Frequency: #{freq}; Relative frequency: #{rel_freq}"
    dpp_common_freqs_hash[pheno] = freq
    dpp_rel_freqs_hash[pheno] = rel_freq
end

puts "EURO-NMD common phenotypes"
common_phenotypes.each do |pheno|
    freq = enmd_phenotype_hash[pheno].to_i
    rel_freq = (freq.to_f/enmd_total_phenotypes.to_f).round(3)
    puts "Phenotype: #{pheno};  Frequency: #{freq}; Relative frequency: #{rel_freq}"
    enmd_common_freqs_hash[pheno] = freq
    enmd_rel_freqs_hash[pheno] = rel_freq
end
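
The relative frequency used above is each phenotype's count divided by the total count of all phenotypes in that registry, rounded to three decimals. The same calculation can be sketched as a small helper; the name `relative_frequencies` and the numbers are made up for illustration:

```ruby
# Hypothetical helper: relative frequency of selected phenotypes,
# using the sum of ALL counts in the registry as the denominator.
def relative_frequencies(counts, selected_keys)
  total = counts.values.sum.to_f
  selected_keys.to_h { |key| [key, (counts[key] / total).round(3)] }
end

# Made-up registry with a total count of 8
counts = { "P1" => 1, "P2" => 3, "P3" => 4 }
p relative_frequencies(counts, ["P1", "P2"])
```

With these numbers, "P1" comes out as 1/8 = 0.125 and "P2" as 3/8 = 0.375, even though "P3" is not among the selected phenotypes, because it still contributes to the total.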

## Analytics
Here is a simple plot of the frequencies of the shared phenotypes.

In [None]:
data_rows = []
common_phenotypes.each do |pheno|
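    # Keep only the last path segment; '\1' is resolved per match, whereas "#{$1}" would be interpolated before gsub runs
    phenolabel = pheno.gsub(/.*?\/(\w+)$/, '\1')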
    data_rows.append ["DPP #{phenolabel}", dpp_common_freqs_hash[pheno]]
    data_rows.append ["ENMD #{phenolabel}", enmd_common_freqs_hash[pheno]]
end

index = Daru::Index.new ['Phenotype', 'Number of people with the phenotype',]
frame = Daru::DataFrame.rows(data_rows)
frame.vectors = index
table =  Daru::View::Table.new(frame)

options =  { title: 'Phenotype frequencies',
             type: :bar,
             height: 500

}
chart = Daru::View::Plot.new(table.table, options)
chart.show_in_iruby
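
The chart labels are meant to show only the last path segment of each phenotype identifier. In Ruby, a replacement string that refers to a capture group should be written as `'\1'`, because an interpolated `"#{$1}"` is evaluated before the substitution runs. A minimal sketch, with a hypothetical helper `phenotype_label` and a made-up identifier:

```ruby
# Hypothetical helper: reduce a URI-like identifier to its last path segment.
# If the pattern does not match, the input is returned unchanged.
def phenotype_label(uri)
  uri.sub(%r{.*/(\w+)\z}, '\1')
end

# Made-up identifier, for illustration only
puts phenotype_label("http://example.org/phenotypes/P_0001")
```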

## Analytics 2
Now, let's compare the relative frequencies of those same phenotypes.

In [None]:

data_rows = []
common_phenotypes.each do |pheno|
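    # Keep only the last path segment; '\1' is resolved per match, whereas "#{$1}" would be interpolated before gsub runs
    phenolabel = pheno.gsub(/.*?\/(\w+)$/, '\1')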
    data_rows.append ["DPP #{phenolabel}", dpp_rel_freqs_hash[pheno]]
    data_rows.append ["ENMD #{phenolabel}", enmd_rel_freqs_hash[pheno]]
end


index = Daru::Index.new ['Phenotype', 'Relative phenotype frequency']
frame = Daru::DataFrame.rows(data_rows)
frame.vectors = index
table =  Daru::View::Table.new(frame)

options =  { title: 'Relative phenotype frequencies',
             type: :bar,
             height: 500

}
chart = Daru::View::Plot.new(table.table, options)
chart.show_in_iruby