Dockerised framework for DAFNI #251

f-allian · 2023-12-12T16:59:48Z

This PR contains an initial commit for the dockerisation of the causal testing framework, so that it can be hosted on DAFNI - closes issue #92 after review.

Steps to reproduce tests

This PR uses the vaccinating elderly example as a test case. The entry point to the framework is wrapped in a script called main_dafni.py, which takes four mandatory input arguments: data_path, dag_path, tests_path and variables_path.

1. Without Docker

Simply run:

python main_dafni.py --variables_path $VARIABLES --dag_path $DAG_PATH --data_path $DATA_PATH --tests_path $CAUSAL_TESTS,

and point to the path containing your configuration files (for this example, everything is defined under ./dafni/inputs). The resulting causal tests will be saved in .json format in the folder ./dafni/outputs. (Note: the folder structure here is important for DAFNI.)
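
For readers unfamiliar with the entry point, the four mandatory arguments could be declared roughly as follows. This is a minimal sketch, not the actual contents of main_dafni.py: the helper name build_parser, the help strings and the use of pathlib.Path as the argument type are assumptions. Only the four flag names and the example paths come from this PR.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the four mandatory inputs (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Causal testing framework entry point (sketch)")
    for flag, help_text in [
        ("--data_path", "path to the input data"),
        ("--dag_path", "path to the causal DAG"),
        ("--tests_path", "path to the causal tests"),
        ("--variables_path", "path to the variable definitions"),
    ]:
        # required=True makes argparse exit with an error if the flag is absent.
        parser.add_argument(flag, required=True, type=Path, help=help_text)
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "--variables_path", "inputs/variables.json",
    "--dag_path", "inputs/dag.dot",
    "--data_path", "inputs/simulated_data.csv",
    "--tests_path", "inputs/causal_tests.json",
])
print(args.variables_path, args.dag_path, args.data_path, args.tests_path)
```

With a parser like this, a flag that is given without a value is rejected before any of the framework code runs.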

2. With Docker

Create a .env file in the ./dafni directory containing the environment variables, which are then passed into the Dockerfile for the build.
Then let docker-compose build the image and create the container by simply running docker-compose up.

Overall, the total execution time (building the image and executing the script) is ~1 minute on my computer (this may take slightly longer on a new setup as there won't be any cached data).

Overall Progress

DAFNI / Docker #92
Create a wrapper of the causal testing framework to be used as an entry-point for Docker
Containerise the causal testing framework using Docker
Update the model_definition.yaml to contain the appropriate dataslot IDs (if any) needed for the execution of the framework
Conduct more tests using different examples (we should have plenty to test)
Upload to DAFNI.

github-actions · 2023-12-12T17:00:48Z

🦙 MegaLinter status: ⚠️ WARNING

Descriptor	Linter	Files	Fixed	Errors	Elapsed time
⚠️ PYTHON	black	29		1	0.95s
✅ PYTHON	pylint	29		0	3.89s

See detailed report in MegaLinter reports

codecov · 2023-12-12T17:01:52Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (48fd185) 95.69% compared to head (5972be0) 95.69%.
Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #251   +/-   ##
=======================================
  Coverage   95.69%   95.69%           
=======================================
  Files          22       22           
  Lines        1557     1557           
=======================================
  Hits         1490     1490           
  Misses         67       67

jmafoster1 · 2023-12-13T10:27:11Z

dafni/inputs/variables.json

Perhaps I'm just not software engineering-y enough, but why do we need separate files for inputs and tests, each with a single key whose value is a list? Could we not either combine tests.json and variables.json into a single file, or have them each just contain a list of the values, or is that in some way bad practice?

A potential alternative to this would be to include this metadata in the dag file as attributes for each variable. That could make things simpler from a user's point of view as they'd then only have to make 1 file (the DAG, the tests can be built automatically from the DAG).
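
To make the first option concrete, here is a rough sketch of folding the two single-key files into one document. The key names "variables" and "tests", the sample entries and the helper name combine_configs are all assumptions for illustration; the real files under dafni/inputs may be shaped differently.

```python
import json


def combine_configs(variables_doc: dict, tests_doc: dict) -> dict:
    """Merge two single-key config documents into one (hypothetical helper)."""
    overlap = variables_doc.keys() & tests_doc.keys()
    if overlap:
        # Refuse to merge silently if both files use the same top-level key.
        raise ValueError(f"Duplicate top-level keys: {sorted(overlap)}")
    return {**variables_doc, **tests_doc}


# Made-up contents standing in for variables.json and tests.json.
variables_doc = json.loads('{"variables": [{"name": "x", "datatype": "float"}]}')
tests_doc = json.loads('{"tests": [{"name": "x_affects_y", "expected_effect": "Positive"}]}')

combined = combine_configs(variables_doc, tests_doc)
print(json.dumps(combined, indent=2))
```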

jmafoster1 · 2023-12-13T13:44:13Z

dafni/main_dafni.py

+        estimators = {"LinearRegressionEstimator": LinearRegressionEstimator}
+
+        # Step 3: Define the expected variables
+
+        expected_outcome_effects = {
+            "Positive": Positive(),
+            "Negative": Negative(),
+            "NoEffect": NoEffect(),
+            "SomeEffect": SomeEffect()}


Is there any way we can get round this hardcoding? This makes it pretty much impossible for a user to implement customisations, and is also the only blocker to having a single default main.py file (I think), now we're putting variables in a file.
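
One possible way round the hardcoding, sketched under assumptions: keep a small registry that maps the names used in the tests file to classes, and let users register their own. Everything here is hypothetical, including the register_effect decorator, the resolve_effect helper and the stand-in classes; in practice the framework's real effect classes would be registered instead.

```python
from typing import Callable, Dict

# Maps the name used in the tests file to a zero-argument factory.
EFFECT_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_effect(cls):
    """Class decorator that makes an effect available under its class name."""
    EFFECT_REGISTRY[cls.__name__] = cls
    return cls


@register_effect
class Positive:  # stand-in, not the framework's Positive
    def apply(self, value: float) -> bool:
        return value > 0


@register_effect
class Negative:  # stand-in, not the framework's Negative
    def apply(self, value: float) -> bool:
        return value < 0


@register_effect
class AtLeastTwo:  # example of a user-defined customisation
    def apply(self, value: float) -> bool:
        return value >= 2


def resolve_effect(name: str):
    """Instantiate the effect registered under `name`, or fail with a clear message."""
    try:
        return EFFECT_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"Unknown effect {name!r}; known: {sorted(EFFECT_REGISTRY)}") from None


print(resolve_effect("AtLeastTwo").apply(3.0))
```

The same pattern would apply to the estimators dictionary, so a single default main.py would not need editing to support a new class.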

jmafoster1 · 2024-01-03T08:27:11Z

@f-allian please can you post your .env file as an example? I've never made one before, so guessed at the following, but docker-compose up fails with error message main_dafni.py: error: argument --variables_path: expected one argument
My .env file:

VARIABLES=inputs/variables.json
DAG_PATH=inputs/dag.dot
DATA_PATH=inputs/simulated_data.csv
CAUSAL_TESTS=inputs/causal_tests.json

christopher-wild · 2024-01-03T08:45:54Z

Hi Michael,

I believe your VARIABLES line should instead be VARIABLES_PATH. i.e.

VARIABLES_PATH=inputs/variables.json
DAG_PATH=inputs/dag.dot
DATA_PATH=inputs/simulated_data.csv
CAUSAL_TESTS=inputs/causal_tests.json

otherwise your .env file is identical to the one Farhad shared with me
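
A misnamed key like this is easy to miss, so a small pre-flight check on the .env file may help. The sketch below is an assumption-laden illustration, not part of this PR: the helper names read_env_file and missing_keys are hypothetical, and the parser only handles plain KEY=VALUE lines rather than the full syntax docker-compose accepts.

```python
import tempfile
from pathlib import Path

# The four names used in the .env file shared in this thread.
EXPECTED_KEYS = {"VARIABLES_PATH", "DAG_PATH", "DATA_PATH", "CAUSAL_TESTS"}


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(path: Path) -> set:
    """Return the expected variable names that the file does not define."""
    return EXPECTED_KEYS - set(read_env_file(path))


# Reproduce the file from the earlier comment, with VARIABLES instead of VARIABLES_PATH.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text(
        "VARIABLES=inputs/variables.json\n"
        "DAG_PATH=inputs/dag.dot\n"
        "DATA_PATH=inputs/simulated_data.csv\n"
        "CAUSAL_TESTS=inputs/causal_tests.json\n"
    )
    missing = missing_keys(env_file)

print(missing)  # {'VARIABLES_PATH'}
```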

jmafoster1 · 2024-01-03T09:00:02Z

Thanks Chris, that works. Would it be sensible to commit the .env file as part of the example, or is there a reason why this isn't a good idea?

f-allian · 2024-01-03T09:27:53Z

> Thanks Chris, that works. Would it be sensible to commit the .env file as part of the example, or is there a reason why this isn't a good idea?

@jmafoster1 There isn't a specific reason why I didn't include it in my commit in this case. Generally speaking, environment files typically contain passwords, API keys etc., which is why they're left out of the version controlling stage. I don't think DAFNI requires that to be specified in our case, so I can provide a template if you think that's useful.

Also, if you think some of my variable naming conventions aren't helpful/easily identifiable, please let me know!

jmafoster1 · 2024-01-03T09:31:42Z

I think it would be helpful to show an example/template. I'm happy with variable names, but we should update the "without docker" run command to include the _PATH suffix to be consistent.

rsomers1998 · 2024-01-03T10:44:45Z

dafni/Dockerfile

+RUN pip install causal-testing-framework --no-cache-dir
+
+# Use the necessary environment variables for the script's inputs
+ENV VARIABLES=./inputs/variables.json \


I understand that these are overwritten by the .env, but VARIABLES here should be VARIABLES_PATH. Also, if we're assuming a .env is supplied, is setting these within the Dockerfile necessary?

@rsomers1998 You were indeed correct! Thanks for flagging this.

christopher-wild

Looks really really good! Found some small clean ups

christopher-wild · 2024-01-03T11:17:28Z

dafni/Dockerfile

+WORKDIR /usr/src/app/
+
+# Install core dependencies using PyPi
+RUN pip install causal-testing-framework --no-cache-dir


Out of curiosity, why is the --no-cache-dir flag used here? My guess would be that there is no causal-testing-framework wheel in the cache, as it's a fresh container.

@cwild-UoS The --no-cache-dir option disables the downloading and storing of cached packages, which reduces the overall Docker image size. I haven't calculated what that difference is, but it's not too important for our purposes

Makes lots of sense! Good idea

christopher-wild · 2024-01-03T11:27:06Z

dafni/main_dafni.py

@@ -0,0 +1,210 @@
+import warnings
+warnings.filterwarnings("ignore", message=".*The 'nopython' keyword.*")


What warning does this filter out? I tried running the script without this line and didn't see any warnings.
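
For context on what the filter does: warnings.filterwarnings treats message as a regular expression that must match the start of the warning text (case-insensitively), which is why the pattern is wrapped in .* on both sides. The sketch below demonstrates this with made-up warning messages and a hypothetical helper, collect_warnings. It does not reproduce whichever library originally raised the 'nopython' warning, which is not identified in this thread.

```python
import warnings


def collect_warnings(messages):
    """Emit each message as a warning and return the ones that were not filtered out."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything by default
        # Same filter as in main_dafni.py; it is inserted ahead of the rule above.
        warnings.filterwarnings("ignore", message=".*The 'nopython' keyword.*")
        for text in messages:
            warnings.warn(text, DeprecationWarning)
    return [str(w.message) for w in caught]


remaining = collect_warnings([
    "Something: The 'nopython' keyword was not supplied",  # made-up text, matches the filter
    "Some unrelated deprecation",                           # made-up text, does not match
])
print(remaining)  # ['Some unrelated deprecation']
```

If the script runs cleanly without the filter, the warning may simply no longer be raised by the installed dependency versions.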

christopher-wild · 2024-01-03T11:30:32Z

dafni/main_dafni.py

+    """
+    if not variables_path.exists() or variables_path.is_dir():
+
+        raise ValidationError(f"Cannot find a valid settings file at {variables_path.absolute()}.")


I'd argue this could be the built in FileNotFoundError rather than a custom exception.
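
A minimal sketch of that suggestion, keeping the same condition and message as the diff above but raising the built-in exception. The function name validate_variables_path and the temporary files are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def validate_variables_path(variables_path: Path) -> Path:
    """Return the path unchanged if it points at a file, otherwise raise FileNotFoundError."""
    if not variables_path.exists() or variables_path.is_dir():
        raise FileNotFoundError(f"Cannot find a valid settings file at {variables_path.absolute()}.")
    return variables_path


with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "variables.json"
    good.write_text("{}")
    checked = validate_variables_path(good)  # an existing file passes

    try:
        validate_variables_path(Path(tmp))  # a directory is rejected
    except FileNotFoundError as err:
        error_message = str(err)

print(error_message)
```

Callers can then catch FileNotFoundError (or its parent OSError) without importing anything from the framework.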

christopher-wild · 2024-01-03T11:35:41Z

dafni/main_dafni.py

+
+        constraints = set()
+
+        for variable, _inputs in zip(variables, inputs):


Why is the element of inputs called _inputs rather than input? Typically the underscore convention is used for private variables

@cwild-UoS input is already a built-in function in Python, so it's not a good idea to overwrite it! I don't think the variable name matters too much in this case, but I can change it if you think it's needed

I think dropping the underscore would be good, just to signal the right intent. Also, isn't the name already distinct from the built-in, since it's inputs rather than input?
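
One way to settle this, sketched with made-up data: note that simply dropping the underscore would give inputs, which would shadow the list being zipped, so a more descriptive loop name avoids the leading underscore, the built-in input and that clash all at once. The name variable_inputs and the shape of the sample data are assumptions, not taken from main_dafni.py.

```python
# Made-up stand-ins for the values loaded from the configuration files.
variables = ["age", "dose"]
inputs = [{"min": 0}, {"max": 5}]

constraints = set()

# `variable_inputs` neither looks private nor shadows `inputs` or the built-in `input`.
for variable, variable_inputs in zip(variables, inputs):
    for bound, value in variable_inputs.items():
        constraints.add((variable, bound, value))

print(sorted(constraints))
```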

christopher-wild · 2024-01-03T11:39:50Z

dafni/main_dafni.py

+
+
+if __name__ == "__main__":
+    main()


For some old-fashioned reason (Joe Heffer explained it once and I forgot), Python files should end with a blank line haha

Add: initial commit for the dockerised ctf

1670195

f-allian added the enhancement New feature or request label Dec 12, 2023

f-allian requested review from jmafoster1, rsomers1998 and christopher-wild December 12, 2023 16:59

f-allian self-assigned this Dec 12, 2023

jmafoster1 reviewed Dec 13, 2023

rsomers1998 reviewed Jan 3, 2024

christopher-wild reviewed Jan 3, 2024

f-allian and others added 2 commits January 17, 2024 11:51

Fix: structure and files uploaded to dafni.

bbafb24

Merge branch 'main' into dafni-branch

42d133f

f-allian requested a review from christopher-wild January 17, 2024 13:01

christopher-wild approved these changes Jan 30, 2024

f-allian and others added 2 commits January 30, 2024 14:52

Fix: final linting

46216aa

Merge branch 'main' into dafni-branch

5972be0

f-allian merged commit e7a3e90 into main Jan 30, 2024

christopher-wild deleted the dafni-branch branch April 4, 2024 08:25

f-allian linked an issue Apr 9, 2024 that may be closed by this pull request

DAFNI / Docker #92

Closed
