Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore queries based on job id #17

Merged
merged 2 commits into from
Aug 22, 2023
Merged

Restore queries based on job id #17

merged 2 commits into from
Aug 22, 2023

Conversation

gpetretto
Copy link
Contributor

Restoring the option to query based on job_id.

  • for commands where a single id can be passed:
    • the id can either be an integer or a string, it will then be recognized as a db_id or job_id depending on the type (thus the --db-id does not exist anymore)
    • an optional additional argument to specify the index has been added. if omitted the job corresponding to the last index will be used.
    • query based on descending sort on the index
  • for commands where multiple ids can be passed:
    • preserved the difference between job_ids and db_ids. Not sure if it would be easy to handle in case one passes a mix of the two if a single option is used. Could be enforced if it would be better to have a single option like in the previous type of commands
    • when querying based on job_ids the index is mandatory and should be provided in the form UUID:INDEX (e.g. e1d66c4f-81db-4fff-bda2-2bf1d79d5961:2)
    • the query is not an aggregation, using uuid+index to identify each job.
  • when querying to get flows
    • if job_ids are used the index is not need. Since all the jobs with the same uuid but different index will belong to the same flow it would be redundant.
    • passing the index will not work, but there is no check whether the index was actually passed in the format UUID:INDEX.It will just result it will simply not find the job, since there is no job with id with the form e1d66c4f-81db-4fff-bda2-2bf1d79d5961:2. Is it fine like this or should there be such a check?
    • the query is the same as before

Copy link
Member

@davidwaroquiers davidwaroquiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!
I have one or two points I am wondering (but maybe it is not really relevant/needed):

  • what happens in the python interface if you were to pass both a db_id and a uuid ? Maybe this could be checked somewhere so that the user cannot pass both ?
  • we could check whether the uuid part of the UUID:INDEX is indeed a uuid (you are already checking that index is an integer). I proposed a small is_valid_uuid function.
    Anyway these are tiny things. I let you decide if it is worth or not.
    Thanks a lot!

src/jobflow_remote/cli/types.py Show resolved Hide resolved
job_ids_indexes = []
for j in job_ids:
split = j.split(":")
if len(split) != 2 or not split[1].isnumeric():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could also check whether the uuid is correct, e.g. using something like (another implementation could use a regex):

import uuid

def is_valid_uuid(uuid_str):
    try:
        uuid_obj = uuid.UUID(uuid_str)
        return str(uuid_obj) == uuid_str
    except ValueError:
        return False

Is it worth ? What do you think? We could inform the user in case it's not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this could also be checked in the above function if we decide to do so here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that, but I am not sure if it is worth. We also don't check that the user doesn't pass a negative db_id for example. In general I expect that when using the uuid it will be a copy/paste. The addition is small though, if you feel it will be better I can add it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that the user may copy paste only part of the uuid by mistake (e.g. miss one character at the beginning or end), so maybe informing him that the format is wrong might give him a clue as to copy it again. Anyway, I let you decide whether to add this.

if job_index is None:
sort = [[FW_INDEX_PATH, DESCENDING]]
else:
query[FW_INDEX_PATH] = job_index
if not query:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if both are passed ? Should we prevent that ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a check before in JobController. I removed it but forgot to move it here. I will add it.

src/jobflow_remote/fireworks/launchpad.py Show resolved Hide resolved
@gpetretto
Copy link
Contributor Author

I added what discussed above. I also slipped in the fix for the problem about the --flow-id option that @marcodigennaro pointed out.

Copy link
Member

@davidwaroquiers davidwaroquiers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@gpetretto gpetretto merged commit a606cc6 into develop Aug 22, 2023
0 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants