@gorskysd gorskysd commented Sep 2, 2022

A few things tackled as part of PROD-399, part of a sync_backend ticket to create a single entry point for the spark predictor.

  • Decoupled Extraction from EventLogBuilder to enable extraction as a separate process unrelated to spark eventlogs specifically. This is useful in the spark_predictor.
  • Fixed up the tests for the decoupling.
  • Fixed a bug where the pandas dataframe row labeled index=0 was selected when the first row after sorting should have been selected. Sorting keeps the original index labels, so the fix was to call .reset_index() after the sort.
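
To illustrate the index bug in the last bullet, here is a minimal sketch. The dataframe and its column names are made up for the example and are not taken from the PR; `drop=True` is also an assumption, used here only to avoid keeping the old labels as a column.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"app": ["a", "b", "c"], "duration": [30, 10, 20]})

sorted_df = df.sort_values("duration")

# Sorting keeps the original index labels, so label 0 is still the "a" row,
# even though it is now the last row in sorted order.
by_label = sorted_df.loc[0, "app"]

# Resetting the index after the sort makes label 0 the first sorted row.
fixed = sorted_df.reset_index(drop=True).loc[0, "app"]

# Positional access with .iloc is an alternative that ignores labels entirely.
by_position = sorted_df.iloc[0]["app"]
```

Either resetting the index or switching to positional access should avoid the mismatch; which one fits better depends on whether later code relies on the labels.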

@gorskysd gorskysd requested a review from rmoneys September 2, 2022 16:15
@gorskysd gorskysd requested a review from NKSync September 7, 2022 20:15

@rmoneys rmoneys left a comment

Looks good. I hope you can fix my mistake in the tests.

EventLogBuilder(event_log.as_uri(), temp_dir).build()
with self.assertRaises(ValueError, msg=msg):
event_log_paths = extractor.Extractor(event_log_path.as_uri(), temp_dir).extract()
eventlog.EventLogBuilder(event_log_paths, temp_dir).build()
Contributor

Nice abstraction. I made an error when I coded the first version of this: I thought assertRaises would check that its "msg" argument matches the message of the raised exception. That's not the case: https://docs.python.org/3/library/unittest.html#unittest.TestCase.assertRaises

I should've done,

with self.assertRaises(ValueError) as cm:
    event_log_paths = extractor.Extractor(event_log_path.as_uri(), temp_dir).extract()
    eventlog.EventLogBuilder(event_log_paths, temp_dir).build()
assert str(cm.exception) == msg, "Exception message matches"
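
A self-contained sketch of the difference, for reference. The `build` function and its error text are hypothetical stand-ins, not the project's code. The first test passes even though `msg` does not match, because `msg` is only used as the failure message when no exception is raised; the other two actually check the exception text.

```python
import unittest


def build(path):
    # Hypothetical stand-in for the code under test.
    raise ValueError(f"No event log found at {path}")


class BuildTest(unittest.TestCase):
    def test_msg_is_not_compared(self):
        # Passes although msg differs from the exception text:
        # msg is only reported if the block fails to raise.
        with self.assertRaises(ValueError, msg="some other text"):
            build("s3://bucket/log")

    def test_message_checked_explicitly(self):
        # Capture the exception and compare its text directly.
        with self.assertRaises(ValueError) as cm:
            build("s3://bucket/log")
        self.assertEqual(str(cm.exception), "No event log found at s3://bucket/log")

    def test_message_checked_by_regex(self):
        # assertRaisesRegex searches the pattern in str(exception).
        with self.assertRaisesRegex(ValueError, r"^No event log found"):
            build("s3://bucket/log")
```

Using `self.assertEqual` instead of a bare `assert` is a small stylistic choice here: it prints both strings when the comparison fails.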

Contributor Author

Done! Had to fix some of the error messages to match, too.

btw, curious why you went with unittest here instead of using pytest's exception testing capabilities.

Contributor

I don't recall making a deliberate choice between the two, but I'm interested in any differences between them, or in examples of the pytest alternative. I do like that these related tests are grouped in a class, and that the class offers assertRaises... felt right
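
For comparison, a rough sketch of what the pytest alternative could look like. The `build` function and its error text are hypothetical stand-ins, not the project's code. Tests can still be grouped in a class: pytest collects plain classes whose names start with `Test`, with no base class needed.

```python
import pytest


def build(path):
    # Hypothetical stand-in for the code under test.
    raise ValueError(f"No event log found at {path}")


class TestBuild:
    def test_message_checked_by_regex(self):
        # match is a regular expression searched in str(exception).
        with pytest.raises(ValueError, match=r"^No event log found"):
            build("s3://bucket/log")

    def test_message_checked_exactly(self):
        # excinfo.value is the raised exception instance.
        with pytest.raises(ValueError) as excinfo:
            build("s3://bucket/log")
        assert str(excinfo.value) == "No event log found at s3://bucket/log"
```

The main practical differences are that `pytest.raises` takes the pattern directly via `match`, and that plain `assert` statements get detailed failure output from pytest's assertion rewriting.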

@gorskysd gorskysd requested a review from rmoneys September 9, 2022 12:00

@rmoneys rmoneys left a comment

Type annotation looks off, but that can be addressed now or later. Looks good!

self.source_url, self.work_dir, self.s3_client, extract_thresholds
)

def _validate_event_log_paths(self, event_log_paths: Path | str) -> Path:
Contributor

I think you want list[Path] | list[str] for the parameter and list[Path] for the return type here

Contributor Author

Best to fix it now :)

sonarqubecloud bot commented Sep 9, 2022

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
2 Code Smells (rating A)

No Coverage information
0.0% Duplication

@gorskysd gorskysd merged commit de7cbfb into main Sep 9, 2022
@gorskysd gorskysd deleted the bugfix/PROD-399-triggered-prediction-error-on-public-api-production-at-2022-07-29-t-16-07-15-529-z branch September 9, 2022 23:09