Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: unit tests for workflows/tasks/submission #2313

Merged
merged 7 commits into from
Jun 5, 2017
Merged

tests: unit tests for workflows/tasks/submission #2313

merged 7 commits into from
Jun 5, 2017

Conversation

jacquerie
Copy link
Contributor

@jacquerie jacquerie commented May 5, 2017

Addresses #1773

  • add_note_entry
  • close_ticket
  • create_ticket
  • filter_keywords
  • prepare_files
  • prepare_keywords
  • remove_references
  • reply_ticket
  • submit_rt_ticket
  • send_robotupload
  • wait_webcoll

@jacquerie
Copy link
Contributor Author

Bugs uncovered while writing these tests:

@jacquerie
Copy link
Contributor Author

jacquerie commented May 7, 2017

Design problems encountered while writing these tests:

  • This whole thing reeks of spaghetti code: relying on some task to write a certain thing in extra_data so that some task at some later point will pick it up is error-prone and is guaranteed to break in the future when tasks are added/deleted/reordered.

  • These function have too many responsibilities, which results in too many mocks that have to be applied to them while testing. For example reply_ticket contains:

    • some logic* to get the ticket_id on which it should operate;
    • some logic to figure out if the user actually exists;
    • some logic (with fallbacks) to build the ticket body;
    • some logic to fetch an RT instance;
    • finally, some logic to reply to a ticket.

    Note that all these items also have fallback logic to log and bail out if any of these steps fail, effectively duplicating the number of responsibilities of this function.

* Note also that this logic is subtly different from the one in close_ticket.

@kaplun
Copy link
Contributor

kaplun commented May 11, 2017

relying on some task writing a certain thing in extra_data so that some task at some later point will pick it up is error-prone and is guaranteed to break in the future when tasks are added/deleted/reordered.

Given the workflow engine we are employing, what better pattern would you propose?

@jacquerie
Copy link
Contributor Author

Given the workflow engine we are employing, what better pattern would you propose?

Enforce dependencies in code. If task A logically requires task B to be run before it (for example: B is a machine learning algorithm, A persists its results in the record) this must be enforced in code. How? Strong conventions on naming things, ban the use of patterns like extra_data.get(...).

@kaplun
Copy link
Contributor

kaplun commented May 11, 2017

Strong conventions on naming things, ban the use of patterns like extra_data.get(...).

Actually that makes me think that the object passed around doesn't need to be really a dictionary, so we could pass an actual object with dedicated properties/methods that can then enforce the conventions.

Only issue would be serialization in the DB. But shouldn't be impossible, to just serialize as dictionary and reinstantiate back as full-fledged object.

@kaplun
Copy link
Contributor

kaplun commented May 11, 2017

Or more simply, indeed we can have setters and getters that receive the obj as first arguments, allowing for still keeping the obj as a simple dict, but hiding the complexity of writing/accessing special extra keys...

@jacquerie jacquerie self-assigned this May 11, 2017
@david-caro
Copy link
Contributor

ban the use of patterns like extra_data.get(...)

I'm curious, why do you think that doing that (banning the usage of .get) is enforcing in code dependencies between tasks?

@jacquerie
Copy link
Contributor Author

I'm curious, why do you think that doing that (banning the usage of .get) is enforcing in code dependencies between tasks?

To avoid failing silently. The problem is in patterns like

foo = extra_data.get('foo')
if foo:
    ...

because this is not enforcing a dependency on foo, so if the task that populates it starts failing tomorrow we are never going to see it. Things should fail loudly and clearly.

@david-caro
Copy link
Contributor

That can be easily avoided with extra_data['foo'] :)

@jacquerie
Copy link
Contributor Author

That was my point!

@david-caro
Copy link
Contributor

Sorry, I missed that point XS

@jacquerie
Copy link
Contributor Author

...and in fact: https://sentry.cern.ch/inspire-sentry/inspire-labs/group/822209/

Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
Signed-off-by: Jacopo Notarstefano <jacopo.notarstefano@gmail.com>
@jacquerie
Copy link
Contributor Author

Victim of #2385.

@jacquerie jacquerie merged commit 008f171 into inspirehep:master Jun 5, 2017
@jacquerie jacquerie deleted the test-workflows-submission-tasks branch June 5, 2017 23:33
@jacquerie jacquerie removed their assignment Aug 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants