Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Field method ADR #174

Merged
merged 3 commits into from
May 15, 2024
Merged

Field method ADR #174

merged 3 commits into from
May 15, 2024

Conversation

ehanson8
Copy link
Contributor

Purpose and background context

ADR for discussion of a field method refactor

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed and verified
  • New dependencies are appropriate or there were no changes

Why these changes are being introduced:
* Discussion is needed on a proposed field method refactor

How this addresses that need:
* Add ADR for proposed field method refactor

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/TIMX-275
Copy link
Contributor

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments, but overall, I think it's great. Succinct, direct, understandable. I think the "Consequences" section is particularly good, explaining the value add from this work in an easily digestable way.

As commented, I think it might benefit from a brief explanation of a "field method" early on, to solidify that term and understanding throughout.

ADR text, approved. And actual ADR decision, I'm in favor!

docs/adrs/0005-field-methods.md Outdated Show resolved Hide resolved

Currently much of the transformation logic for each source is found in the `get_optional_fields` method which extracts data for most of the fields. This results in very large methods as well as inefficient orchestration and testing.

A field method approach would address many of these complications. All of the required fields have dedicated field methods as well as some of the complex fields that are called by `get_optional_fields`. This refactor can be seen as an extension of that approach.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I understand what a "field method" is, I think that term may need to be introduced here if it's going to be used.

That way, it also makes sense exactly what it means here,

All of the required fields have dedicated field methods...

Though I think the Decision and Consequences section do a good job of defining what a "field method" is, I believe it should be up here as well (or at least first), to help with this section.

Copy link
Contributor Author

@ehanson8 ehanson8 May 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion, let me know if this is a good definition



## Consequences
Field methods should ultimately simplfy the source transforms, and the overall application, by providing an easily repeatable structure with method docstrings offering additional context for the more complicated logic required for some fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, concise, accurate, and I agree!

## Consequences
Field methods should ultimately simplfy the source transforms, and the overall application, by providing an easily repeatable structure with method docstrings offering additional context for the more complicated logic required for some fields.

Testing should also be significantly improved. The current testing suite is built around very large fixtures and unit tests meant to encompass the expected data scenarios for each source's records. They are not easy to parse and can be a barrier to new developers trying to understand the application. Separate tests for each field method should simplfy the testing infrastructure and make it easier to maintain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - again, this is well stated.


Testing should also be significantly improved. The current testing suite is built around very large fixtures and unit tests meant to encompass the expected data scenarios for each source's records. They are not easy to parse and can be a barrier to new developers trying to understand the application. Separate tests for each field method should simplfy the testing infrastructure and make it easier to maintain.

Logging and associated analytics should also be easier to manage and provide a better view of the data transformed by this application.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the focus of analytics and metrics in current strategic plans, and TIMDEX roadmapping, I think it could be helpful to provide an example here, even if minimal.

One that comes to mind is easily (because it is technically possible before a refactor) surfacing metrics about particular fields. Ignoring for the moment where the metrics are sent, it now seems possible to wrap the "field methods" during orchestration of the transform, thereby allowing for a consistent and shared way of reporting errors or outlier behavior when transforming fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am stealing some of this language : )

@ehanson8
Copy link
Contributor Author

ehanson8 commented May 15, 2024

It is not clear why a coveralls error is suddenly tanking CI given that I only updated the ADR but I've made some updates in response to @ghukill's suggestions. Edit: Error resolved after a few runs, likely a connection issue

@ghukill
Copy link
Contributor

ghukill commented May 15, 2024

It is not clear why a coveralls error is suddenly tanking CI given that I only updated the ADR but I've made some updates in response to @ghukill's suggestions. Edit: Error resolved after a few runs, likely a connection issue

This happened to me the other day as well; just intermittent coveralls hiccups it seems.

Copy link
Contributor

@ghukill ghukill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Looking forward to our discussion.


Currently much of the transformation logic for each source is found in the `get_optional_fields` method which extracts data for all optional fields. This results in very large methods as well as inefficient orchestration and testing.

A field method approach would address many of these complications, where each `TimdexRecord` field has a corresponding method containing all of the logic related to extracting and formatting the values for that field. All of the required fields have dedicated field methods as well as some of the complex fields that are called by `get_optional_fields`. This refactor can be seen as an extension of that approach.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dig it. Thanks for explaining what a "field method" is here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! Awesome description.


Testing should also be significantly improved. The current testing suite is built around very large fixtures and unit tests meant to encompass the expected data scenarios for each source's records. They are not easy to parse and can be a barrier to new developers trying to understand the application. Separate tests for each field method should simplfy the testing infrastructure and make it easier to maintain.

Logging and associated analytics should also be easier to manage and provide a better view of the data transformed by this application. During orchestration of the transform, the field methods will provide a consistent and shared way of reporting errors or outlier behavior when transforming fields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@jonavellecuerdo jonavellecuerdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really well-written ✨


Currently much of the transformation logic for each source is found in the `get_optional_fields` method which extracts data for all optional fields. This results in very large methods as well as inefficient orchestration and testing.

A field method approach would address many of these complications, where each `TimdexRecord` field has a corresponding method containing all of the logic related to extracting and formatting the values for that field. All of the required fields have dedicated field methods as well as some of the complex fields that are called by `get_optional_fields`. This refactor can be seen as an extension of that approach.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! Awesome description.

@ehanson8 ehanson8 merged commit d32b6b6 into main May 15, 2024
5 checks passed
@ehanson8 ehanson8 deleted the field-method-adr branch May 15, 2024 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants