Update load settings and make it the defacto load logic #1921

ThomasHepworth · 2024-02-01T21:15:10Z

Type of PR

BUG
FEAT
MAINT
DOC

Is your Pull Request linked to an existing Issue or Pull Request?

Should aid in simplifying #1669, as it simplifies a lot of the code involved with the settings objects.

Give a brief description for the solution you have provided

This PR consolidates our settings loading logic, moving everything into a single method irrespective of the input type.

The goal with these changes is to simplify how a settings object interacts with our validation steps and to also simplify any future changes to our settings class.

For some context on the problem, there are three different scenarios in which we handle a "settings object":

1. The user initialises a linker without any settings, so our settings are set to None.
2. The user initialises a linker with a settings dictionary, which we can process like normal.
3. The user initialises a linker without settings and then loads their settings in later.

With (1), we simply need to set up a cache uid to help with storing any outputs (some charts can still be run, even without any settings) and then short-circuit the code (there are no settings to process).

With (2) and (3), we essentially want to process the settings in the same way, but some of our environment variables may differ. We therefore need to take into account the fact that:

Cache UID needs to be easily overwritten.
SQL dialect needs to be validated each time a settings object is loaded.
The current cache values may need to be removed (a new cache uid may be introduced, which would essentially invalidate the cache anyway).

We were already checking the dialect and invalidating the cache, but I've added some changes to the UID to (fingers crossed) ensure it isn't as fragile. More on this below.

In summary

Why?

Simplifies our validation checking - validation checks now only need to occur in one location and it's easier to ensure that we can trigger checks both before and after settings creation.
Reduces the complexity of our settings loader, making the code more readable and maintainable.

How?

Cache UID is now loaded from the settings dictionary (if it exists) or from the linker's environment. This removes any issues that we might otherwise face with merging the original load_settings and setup_settings_obj functions. More in this previously closed PR.
Our previously fragmented load settings logic (which contained lots of confusing conditionals) has now been consolidated into a single load_settings method.
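
As a rough illustration of the consolidated flow described above (this is not the code in this PR), the sketch below routes all three scenarios through one `load_settings` method. `ToyLinker`, its attributes, and the way the `linker_uid` and `sql_dialect` keys are read here are assumptions made for the example. For simplicity it always clears the cache on load, whereas the description only says the cache may need to be removed.

```python
import uuid
from typing import Optional


class ToyLinker:
    """Hypothetical stand-in for a linker; not Splink's actual class."""

    def __init__(self, sql_dialect: str, settings: Optional[dict] = None):
        self.sql_dialect = sql_dialect
        self.settings: Optional[dict] = None
        self.cache: dict = {}
        # A cache uid always exists, even when no settings are supplied.
        self.cache_uid = uuid.uuid4().hex[:8]
        # Scenarios (1) and (2) use the same entry point as scenario (3).
        self.load_settings(settings)

    def load_settings(self, settings: Optional[dict]) -> None:
        # Scenario (1): nothing to process, so short-circuit.
        if settings is None:
            return

        # The dialect is checked on every load, not only at construction.
        dialect = settings.get("sql_dialect", self.sql_dialect)
        if dialect != self.sql_dialect:
            raise ValueError(
                f"Settings target '{dialect}' but the linker uses "
                f"'{self.sql_dialect}'."
            )

        # Take the uid from the settings if present, else keep the linker's.
        self.cache_uid = settings.get("linker_uid", self.cache_uid)

        # Cached outputs may no longer match the newly loaded settings.
        self.cache.clear()
        self.settings = dict(settings)
```

With this shape, `ToyLinker("duckdb")` followed later by `linker.load_settings({...})` goes through exactly the same checks as passing the dictionary at construction time, which is the point of having a single entry point.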

PR Checklist

Added documentation for changes
Added feature to example notebooks or tutorial (if appropriate)
Added tests (if appropriate)
Updated CHANGELOG.md (if appropriate)
Made changes based off the latest version of Splink
Run the linter

RobinL · 2024-02-06T14:47:48Z

All looks good to me, once it's pointed at master and the checks are passing, happy to approve

ThomasHepworth · 2024-02-06T15:45:45Z

@RobinL see CI/CD checks.

ThomasHepworth added 3 commits January 31, 2024 13:31

Update venv to use a custom name and edit errors

7ac55ee

Clean up how we load settings in the linker

2b1a453

undo changes to exceptions.py

91588b9

ThomasHepworth requested a review from RobinL February 1, 2024 21:15

ThomasHepworth added 4 commits February 1, 2024 21:15

lint with black

0587612

Add a validation check for invalid levels within comparisons

49e0b1e

Test invalid level within a comparison

0dc4cf3

lint & format settings validation code

ddc1328

ThomasHepworth mentioned this pull request Feb 5, 2024

Add comparison level validation check #1926

Merged

10 tasks

ThomasHepworth and others added 3 commits February 5, 2024 11:36

Merge branch 'update_load_settings_and_make_it_the_defacto_load_logic…

6c823af

…' into add_comparison_level_validation_check

lint with black

c35463f

satisfy the linter

84ff619

ThomasHepworth changed the base branch from update_venv_bash_script to master February 6, 2024 15:00

ThomasHepworth and others added 3 commits February 6, 2024 15:31

rm markdown includes

da3b332

Update splink/settings_validation/settings_validation_log_strings.py

70dd549

Co-authored-by: Robin Linacre <robin.linacre@digital.justice.gov.uk>

Merge pull request #1926 from moj-analytical-services/add_comparison_…

670389e

…level_validation_check Add comparison level validation check

RobinL approved these changes Feb 6, 2024

ThomasHepworth merged commit 051e4ac into master Feb 6, 2024
10 checks passed

ThomasHepworth deleted the update_load_settings_and_make_it_the_defacto_load_logic branch February 6, 2024 15:59
