Update load settings and make it the defacto load logic #1921

Merged

Conversation

ThomasHepworth
Contributor

Type of PR

  • BUG
  • FEAT
  • MAINT
  • DOC

Is your Pull Request linked to an existing Issue or Pull Request?

This should help with #1669 by simplifying much of the code involved with the settings objects.

Give a brief description for the solution you have provided

This PR consolidates our settings loading logic, moving everything into a single method irrespective of the input type.

The goal with these changes is to simplify how a settings object interacts with our validation steps and to also simplify any future changes to our settings class.

For some context on the problem, there are three different scenarios in which we handle a "settings object":

  1. The user initialises a linker without any settings, so our settings are set to None.
  2. The user initialises a linker with a settings dictionary, which we can process as normal.
  3. The user initialises a linker without settings and then loads them in later.

With (1), we simply need to set up a cache UID to help with storing any outputs (some charts can still be run, even without any settings) and then short-circuit the code, as there are no settings to process.

With (2) and (3), we essentially want to process the settings in the same way, but some of our environment variables may differ. We therefore need to take into account the fact that:

  • Cache UID needs to be easily overwritten.
  • SQL dialect needs to be validated each time a settings object is loaded.
  • The current cache values may need to be removed (a new cache uid may be introduced, which would essentially invalidate the cache anyway).

We were already checking the dialect and invalidating the cache, but I've added some changes to the UID to (fingers crossed) ensure it isn't as fragile. More on this below.
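
As a rough illustration of the three scenarios above, here is a minimal sketch of a single loading method. It is not the code from this PR: `ToyLinker`, the `cache` attribute and the `cache_uid` settings key are hypothetical names chosen for the example, and the dialect check is simplified to a string comparison.

```python
import uuid
from typing import Optional


class ToyLinker:
    """Hypothetical stand-in for a linker; not Splink's real class."""

    def __init__(self, dialect: str, settings: Optional[dict] = None):
        self.dialect = dialect
        self.cache: dict = {}
        # A cache UID always exists, even when no settings are supplied.
        self.cache_uid = uuid.uuid4().hex[:8]
        self.settings: Optional[dict] = None
        # Scenarios (1) and (2) go through the same method as scenario (3).
        self.load_settings(settings)

    def load_settings(self, settings: Optional[dict]) -> None:
        if settings is None:
            # Scenario (1): nothing to process, so short-circuit.
            return

        # Validate the SQL dialect every time settings are loaded.
        settings_dialect = settings.get("sql_dialect", self.dialect)
        if settings_dialect != self.dialect:
            raise ValueError(
                f"Settings dialect {settings_dialect!r} does not match "
                f"linker dialect {self.dialect!r}"
            )

        # A UID in the settings (assumed key name) overrides the current one.
        self.cache_uid = settings.get("cache_uid", self.cache_uid)
        # Drop cached outputs, since they belong to the previous settings.
        self.cache.clear()
        self.settings = dict(settings)
```

With this shape, `ToyLinker("duckdb")` followed later by `load_settings({...})` takes the same path as passing the dictionary at construction time, so any validation added to `load_settings` covers both cases.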


In summary

Why?

  • Simplifies our validation checking - validation checks now only need to occur in one location and it's easier to ensure that we can trigger checks both before and after settings creation.
  • Reduces the complexity of our settings loader, making the code more readable and maintainable.

How?

  • Cache UID is now loaded from the settings dictionary (if it exists) or from the linker's environment. This removes any issues that we might otherwise face with merging the original load_settings and setup_settings_obj functions. More in this previously closed PR.
  • Our previously fragmented load settings logic (which contained lots of confusing conditionals) has now been consolidated into a single load_settings method.
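
The cache UID precedence described in the first bullet could be sketched as follows. This is an assumption-laden illustration rather than the merged implementation: `resolve_cache_uid` and the `cache_uid` key are hypothetical names, and the fallback to a freshly generated UID is an assumption about what happens when neither source provides one.

```python
import uuid
from typing import Optional


def resolve_cache_uid(
    settings: Optional[dict], current_uid: Optional[str] = None
) -> str:
    """Pick a cache UID: settings first, then the linker's own, then a new one."""
    # 1. Prefer a UID supplied in the settings dictionary (assumed key name).
    if settings and settings.get("cache_uid"):
        return settings["cache_uid"]
    # 2. Otherwise keep whatever the linker's environment already holds.
    if current_uid:
        return current_uid
    # 3. Assumed fallback: generate a fresh short UID.
    return uuid.uuid4().hex[:8]
```

Keeping the precedence in one place means the UID can be overwritten simply by including it in the settings that are loaded, without a separate code path for each scenario.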

PR Checklist

  • Added documentation for changes
  • Added feature to example notebooks or tutorial (if appropriate)
  • Added tests (if appropriate)
  • Updated CHANGELOG.md (if appropriate)
  • Made changes based off the latest version of Splink
  • Run the linter

@RobinL
Member

RobinL commented Feb 6, 2024

All looks good to me, once it's pointed at master and the checks are passing, happy to approve

@ThomasHepworth ThomasHepworth changed the base branch from update_venv_bash_script to master February 6, 2024 15:00
ThomasHepworth and others added 3 commits February 6, 2024 15:31
Co-authored-by: Robin Linacre <robin.linacre@digital.justice.gov.uk>
…level_validation_check

Add comparison level validation check
@ThomasHepworth
Contributor Author

ThomasHepworth commented Feb 6, 2024

@RobinL see CI/CD checks.

@ThomasHepworth ThomasHepworth merged commit 051e4ac into master Feb 6, 2024
10 checks passed
@ThomasHepworth ThomasHepworth deleted the update_load_settings_and_make_it_the_defacto_load_logic branch February 6, 2024 15:59