This repository was archived by the owner on Mar 13, 2020. It is now read-only.

@ChintanRaval ChintanRaval commented Jan 7, 2019

Adds the following new commands:

  • compare: Compares the SHA256 checksums of the given models against those of the last successful execution and persists them. Returns a comma-separated string of changed model names. Required parameters:
    • execution-id: the GUID of an existing data pipeline execution, as returned by the init command.
    • model-type: the type of models being processed, e.g. load, transform. The model type is used to group model checksums and to find the older checksums to compare against.
    • base-path: absolute or relative path to the models, e.g. ./load, /home/local/load, C:/path/to/load
    • model-patterns: path-based patterns (relative to base-path) that match the model files, including their extensions. Models within a model-type must be named uniquely regardless of their file extension. Examples: *.txt, **/*.txt, ./relative/path/to/some_models/**/*.csv, relative/path/to/some/more/related/models/**/*.sql
  • complete: Marks an existing execution as completed by updating its record in the given database. Returns nothing unless there is an error. Required parameter:
    • execution-id: the GUID of an existing data pipeline execution, as returned by the init command.
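
As a rough illustration of what the compare command describes, the checksum comparison could be sketched as below. This is not the code from this PR: the function names `compute_checksums` and `changed_models` are hypothetical, the previous checksums are passed in as a plain dict rather than read from the database, and using the file stem as the model name is an assumption based on the uniqueness rule above.

```python
import hashlib
from pathlib import Path


def compute_checksums(base_path, model_patterns):
    """Map each model name (file stem, an assumption) to the SHA256 hex digest of its file."""
    checksums = {}
    base = Path(base_path)
    for pattern in model_patterns:
        # Patterns are resolved relative to base_path, as the description states.
        for path in sorted(base.glob(pattern)):
            if path.is_file():
                checksums[path.stem] = hashlib.sha256(path.read_bytes()).hexdigest()
    return checksums


def changed_models(current, previous):
    """Return a comma-separated string of models whose checksum is new or different."""
    changed = [name for name, digest in sorted(current.items())
               if previous.get(name) != digest]
    return ','.join(changed)
```

A caller would compute the current checksums, compare them with those stored for the last successful execution of the same model-type, and persist the new ones. This sketch only reports new or modified models; how removed models are treated is not stated in the description, so it is left out here.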

…hash function) upon successful completion of a data pipeline execution
@ChintanRaval ChintanRaval requested a review from a team January 7, 2019 05:19
ChintanRaval commented Jan 7, 2019

I have tried a few approaches before ending up with the one in this PR but I am not content with what I've done here... so I'll be doing more reading over the best practices before deciding on whether or not to go further with the current approach. 🤔

@elexisvenator
Collaborator

No description provided.

???

… transform models upon finishing a data pipeline execution
@ChintanRaval
Collaborator Author

No description provided.

???

The details are already in the story on JIRA.

@ChintanRaval ChintanRaval requested a review from a team January 17, 2019 23:37
elexisvenator previously approved these changes Jan 17, 2019
@ChintanRaval ChintanRaval requested a review from a team January 18, 2019 00:15
metavar='execution_id',
help='data pipeline execution id as received using \'start\' command')
finish_command_parser.add_argument('model_folder_paths',
metavar='model-folder-paths',
Collaborator

Should model-folder-paths be snake case like execution_id?

Collaborator Author

Thanks, good catch. I mucked around with it a bit too much. It intrinsically turns it into a variable with snake case, but I was trying to be explicit with metavar and things went wrong.

Collaborator Author

Turns out the discrepancy was actually in execution_id, and model-folder-paths was right all along.
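
For readers unfamiliar with the distinction discussed in this thread, here is a minimal sketch of how argparse treats the name and the metavar of a positional argument. The parser below is illustrative only: the `prog` name, the help strings, and `nargs='+'` are assumptions, not taken from the PR. The first string passed to `add_argument` becomes the attribute name on the parsed namespace, while `metavar` only changes how the argument is displayed in usage and help text.

```python
import argparse

# Hypothetical parser; only the argument names and metavars echo the thread above.
parser = argparse.ArgumentParser(prog='example')

# For positionals, the name is used as the namespace attribute (snake_case here),
# and metavar controls only what appears in usage/help output (hyphenated here).
parser.add_argument('execution_id',
                    metavar='execution-id',
                    help='data pipeline execution id')
parser.add_argument('model_folder_paths',
                    metavar='model-folder-paths',
                    nargs='+',  # assumption: one or more paths
                    help='paths to the model folders')

args = parser.parse_args(['abc-123', './load', './transform'])
usage = parser.format_usage()
```

With this setup the code reads `args.execution_id` and `args.model_folder_paths`, while the usage line shows `execution-id` and `model-folder-paths`, so the attribute names and the displayed names can follow different conventions consistently.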

waltaro previously approved these changes Jan 20, 2019
elexisvenator previously approved these changes Jan 21, 2019
…ist model checksums against the last successful execution (b) update 'finish' command to only mark execution as completed
@ChintanRaval ChintanRaval dismissed stale reviews from elexisvenator and waltaro via dbe8ace January 22, 2019 22:38
(a) rename 'start' command to 'init'
(b) rename 'finish' command to 'complete'
(c) remove 'in progress' status
@ChintanRaval ChintanRaval changed the title [OSC-1137] - add ability to persist model file checksums upon successful completion [OSC-1136, OSC-1137] - add 'compare' and 'complete' command Jan 23, 2019
@ChintanRaval ChintanRaval changed the title [OSC-1136, OSC-1137] - add 'compare' and 'complete' command [OSC-1136, OSC-1137] - add 'compare' and 'complete' commands Jan 23, 2019
@ChintanRaval ChintanRaval requested a review from a team January 23, 2019 01:49
__tablename__ = TABLE_NAME
__table_args__ = {'schema': Constants.DATA_PIPELINE_EXECUTION_SCHEMA_NAME}

id = Column('id',
Collaborator

Should we use PRIMARY_KEY_COL_NAME = 'id' here as well, similar to DataPipelineExecutionEntity?

waltaro previously approved these changes Jan 23, 2019
elexisvenator previously approved these changes Jan 23, 2019
@ChintanRaval ChintanRaval dismissed stale reviews from elexisvenator and waltaro via d7d1a83 January 24, 2019 00:44
@ChintanRaval ChintanRaval merged commit 22b3e5a into master Jan 24, 2019
@ChintanRaval ChintanRaval deleted the feature/OSC-1137-persist-model-checksums branch January 24, 2019 01:44

5 participants