Dev (#96)
* Adding functionality for nesting algorithms (#84)

* Fixing the input graph feature.

* ON the input graph feature.

* Passing seed number to the tetrad modules.

* Adding discrete data examples for boss and grasp.

* Fixing boss and grasp json schemas.

* Added test for grasp.

* Updating docs and conf.

* Fixed bug in benchpress module when empty graph files are written.


---------

Co-authored-by: Mohamad Elmasri <melmasri@users.noreply.github.com>
Co-authored-by: Mohamad <mo@julia>
3 people committed Oct 26, 2023
1 parent 423ce41 commit 5a43979
Showing 5 changed files with 20 additions and 29 deletions.
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.4.0
2.5.0
17 changes: 3 additions & 14 deletions docs/source/index.rst
@@ -35,7 +35,6 @@
available_structure_learning_algorithms
available_evaluations


.. toctree::
:hidden:
:maxdepth: 1
@@ -44,14 +43,6 @@

module_add

.. .. toctree::
.. :hidden:
.. :maxdepth: 3
.. :name: File formats
.. :caption: File formats
.. data_formats
.. toctree::
:hidden:
:maxdepth: 3
@@ -71,12 +62,10 @@

------------------------


##################################
Benchpress
##################################


Describing the relationship between the variables in a study domain and modelling
the data generating mechanism is a fundamental problem in many empirical sciences.
`Probabilistic graphical models <https://en.wikipedia.org/wiki/Graphical_model>`_ are one common approach to tackle the problem.
@@ -97,12 +86,12 @@ generated datasets, the workflow also includes a number of standard datasets and
* The paper :footcite:t:`rios2021benchpress`
* The `GitHub <https://github.com/felixleopoldo/benchpress>`_ repository
* This `Medium story <https://medium.com/@felixleopoldorios/structure-learning-using-benchpress-826847db0aa8>`_
* This tutorial `UAI 2023 Tutorial: Structure Learning Using Benchpress - YouTube <https://www.youtube.com/watch?v=tx3hIH3b9Hg>`_
* The `Discord <https://discord.com/channels/1007933286724685824/1007933287215411284>`_ chat for any kind of questions

* This video tutorial `UAI 2023 Tutorial: Structure Learning Using Benchpress - YouTube <https://www.youtube.com/watch?v=tx3hIH3b9Hg>`_
* The `Discord <https://discord.com/channels/1007933286724685824/1007933287215411284>`_ chat for any kind of discussions etc.

.. rubric:: Updates

* 2023-10-13: Benchpress 2.5.0 released. Added the feature to pass the graph estimate of one algorithm as input of another. Added the algorithm module :ref:`athomas_jtsamplers` for MCMC estimating graphs of undirected decomposable graphical models.
* 2023-09-24: Benchpress 2.4.0 released. Added the Psi-learner algorithm for learning graphs of undirected Gaussian graphical models (:ref:`equsa_psilearner`).
* 2023-09-19: Benchpress 2.3.0 released. Updated causal-cmd to version 1.10.0. Added the BOSS algorithm (:ref:`tetrad_boss`).
* 2023-09-08: Benchpress 2.2.0 released. Now supporting the `ARM64 <https://en.wikipedia.org/wiki/AArch64>`_ architecture used e.g. by the recent Apple computers.
4 changes: 2 additions & 2 deletions workflow/rules/data/iid/rules.smk
@@ -8,7 +8,7 @@ rule sample_bin_bn_data:
output:
data="{output_dir}/data" \
"/adjmat=/{adjmat}"\
"/parameters=/bin_bn/{bn}"\
"/parameters=/bin_bn/{bn}"\
"/data=/"+pattern_strings["iid"] + "/" \
"seed={replicate}.csv"
wildcard_constraints:
@@ -107,7 +107,7 @@ rule sample_data_fixed_bnfit:
data="{output_dir}/data/adjmat=/{adjmat}/parameters=/bn.fit_networks/{bn}/data=/"+pattern_strings["iid"]+"/seed={replicate}.csv"
wildcard_constraints:
n="[0-9]*",
bn=".*\.rds"
bn=".*\.rds"
shell:
"Rscript workflow/rules/data/iid/sample_from_bnlearn_bn.R " \
"--filename {output.data} " \
8 changes: 4 additions & 4 deletions workflow/rules/evaluation/benchmarks/path_generators.py
@@ -33,7 +33,7 @@ def summarise_alg_input_adjmat_est_path(algorithm):
"adjmat=/{adjmat}/"\
"parameters=/{bn}/"\
"data=/{data}/"\
"algorithm=/" + pattern_strings[algorithm] + "/" + \
"algorithm=/" + pattern_strings[algorithm] + "/" \
"seed={replicate}/" \
"adjmat.csv"

@@ -42,7 +42,7 @@ def summarise_alg_input_time_path(algorithm):
"adjmat=/{adjmat}/"\
"parameters=/{bn}/"\
"data=/{data}/" \
"algorithm=/" + pattern_strings[algorithm] + "/" + \
"algorithm=/" + pattern_strings[algorithm] + "/" \
"seed={replicate}/" \
"time.txt"

@@ -52,13 +52,13 @@ def summarise_alg_input_ntests_path(algorithm):
"adjmat=/{adjmat}/"\
"parameters=/{bn}/"\
"data=/{data}/" \
"algorithm=/" + pattern_strings[algorithm] + "/" + \
"algorithm=/" + pattern_strings[algorithm] + "/" \
"seed={replicate}/" \
"ntests.txt"

def summarise_alg_output_res_path(algorithm):
return "{output_dir}/result/"\
"algorithm=/" + pattern_strings[algorithm] + "/" + \
"algorithm=/" + pattern_strings[algorithm] + "/" \
"adjmat=/{adjmat}/"\
"parameters=/{bn}/"\
"data=/{data}/"\
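Note on the path_generators.py hunks above: each changed pair of lines appears to differ only in a trailing `+` before the line continuation. A minimal sketch of why both spellings build the same path, assuming a hypothetical `pattern_strings` dictionary and a made-up `{output_dir}/example/` prefix (neither is taken from the repository):

```python
# Hypothetical stand-in for the workflow's pattern_strings lookup.
pattern_strings = {"demo_alg": "demo_alg/alg_params=/{alg_string}"}


def path_with_explicit_plus(algorithm):
    # Explicit "+" before the line continuation.
    return "{output_dir}/example/" \
        "data=/{data}/" \
        "algorithm=/" + pattern_strings[algorithm] + "/" + \
        "seed={replicate}/" \
        "adjmat.csv"


def path_with_implicit_concat(algorithm):
    # No "+": adjacent string literals are joined by the Python parser,
    # so "/" "seed={replicate}/" "adjmat.csv" becomes a single literal.
    return "{output_dir}/example/" \
        "data=/{data}/" \
        "algorithm=/" + pattern_strings[algorithm] + "/" \
        "seed={replicate}/" \
        "adjmat.csv"


if __name__ == "__main__":
    same = path_with_explicit_plus("demo_alg") == path_with_implicit_concat("demo_alg")
    print("identical paths:", same)
```

Under these assumptions the change is purely stylistic. Implicit concatenation only applies to adjacent literals, so the `+` around `pattern_strings[algorithm]` is still required.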
18 changes: 10 additions & 8 deletions workflow/rules/evaluation/benchmarks/plot_ROC.R
@@ -61,7 +61,6 @@ fpr_tpr_pattern <- function(){
)
}
} + {

if (!param_annot) {
geom_point(
data = toplot, alpha = 0.5,
@@ -84,7 +83,7 @@ fpr_tpr_pattern <- function(){
col = id_numlev
), shape = 20,
size = 1
)
)
}
} + {
if (scatter && show_seed) {
@@ -106,7 +105,7 @@ fpr_tpr_pattern <- function(){
replace_na(list("curve_vals" = 0)) %>%
mutate(SHDP_pattern_median = 1 - TPR_pattern_median + FPRn_pattern_median) %>%
filter(SHDP_pattern_median == min(SHDP_pattern_median)),
alpha = 0.8, position = "dodge", alpha = 1, show.legend = FALSE,
alpha = 0.8, position = "dodge", show.legend = FALSE,
aes(
x = FPRn_pattern_median, y = TPR_pattern_median,
col = id_numlev, label = id_num
@@ -126,7 +125,7 @@ fpr_tpr_pattern <- function(){
)
}
} +
guides(shape = FALSE) +
guides(shape = "none") +
facet_wrap(. ~ adjmat + parameters + data + n_seeds, nrow = 2) +
{
if (!is.null(xlim)) {
@@ -929,9 +928,12 @@ if (file.info(snakemake@input[["csv"]])$size == 0) {
toplot <- read.csv(snakemake@input[["csv"]]) # Median, mean, quantiles, taken over the seeds
joint_bench <- read.csv(snakemake@input[["raw_bench"]]) # All raw benchmarks in one dataframe

replacement_list <- list(parameters = "NA") # converts NA to string "NA" in the dataframe
toplot[is.na(toplot)] <- "NA"
joint_bench[is.na(joint_bench)] <- "NA"
# ME: converting NA to string causes mixed types in a column;
# in this case R converts the whole column to string.
# Made an alternative fix below.
# replacement_list <- list(parameters = "NA") # converts NA to string "NA" in the dataframe
# toplot[is.na(toplot)] <- "NA"
# joint_bench[is.na(joint_bench)] <- "NA"
#toplot <- toplot %>% replace_na(replacement_list)
#joint_bench <- joint_bench %>% replace_na(replacement_list)

@@ -997,7 +999,7 @@ if (file.info(snakemake@input[["csv"]])$size == 0) {
filter(adjmat == adjmat2) %>%
filter(parameters == parameters2) %>%
filter(data == data2)

if (nrow(joint_bench) > 0) {
fpr_tpr_pattern()
fpr_tpr_skel()