-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYNPY-1357] Allow multiple values in manifest TSV #1030
Conversation
This reverts commit 9317ce4.
Hello @BryanFauble! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2024-01-11 21:05:33 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome start, left some comments.
self.row1 = ( | ||
'%s %s %s "%s;https://www.example.com" provName bar 2020-01-01 2023-12-04T07:00:00Z 2023-12-05 23:37:02.995000+00:00 2023-12-05 07:00:00+00:00\n' | ||
'%s %s %s "%s;https://www.example.com" provName bar 2020-01-01 2023-12-04T07:00:00Z 2023-12-05 23:37:02+00:00 2023-12-05 07:00:00+00:00 a,b,c,d 2020-01-01,2023-12-04T07:00:00.111Z,2023-12-05 23:37:02.333+00:00,2023-12-05 07:00:00+00:00 fAlSe,False,tRuE,True 1,2,3,4 1.2,3.4,5.6,7.8\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test case of a string with escaped commas? What does that look like right now when theres a table value with 'my sentence, comma, has commas'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
synapseutils/sync.py
Outdated
""" | ||
values_to_return = [] | ||
|
||
cell_values = re.split(pattern=COMMA_PATTERN, string=cell) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it confusing that setting multiple annotations via a fileview is different from setting multiple annotations via the entity itself?
Future work: do we need to standardize is this way? or is that overkill?
my_file_ent.multiple_annot = "my,multiple,annot"
my_file_ent.annot_with_comma = "my sentence\\,is broken"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding comments here around internal discussions: This implementation is going to require a major version update to the client as those users who might currently be using commas or backslashes within their annotations managed through the Manifest TSV file will need to escape those characters. We will be meeting internally again in January to settle on these changes and propose the path forward for this, possibly tying this with the release to support only the personal access token for authentication and other ground work for future releases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 nice! Going to approve here - we can create a ticket to revisit in January but this is a working solution.
From Milens comment in slack, it does seem like he wants to make it easier for users.
I think what this means is like you suggested, we should create a platform ticket to ask them how often "commas" are in annotations right now and then decide from there what is the least breaking change for all of our users. (Not just schematic)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛸 LGTM! I gave most of my feedback during the first round of PR.
Problem:
syncFromSynapse
function we were only storing the first indexed value for an annotation that had multiple values.syncToSynapse
function we had no way to specify that multiple values are supposed to be added to an Annotation.Solution:
syncFromSynapse
function I am writing a comma delimited list wrapped in brackets for multiple annotation values. This is to match the expected format that works when syncing data to a FileView in Synapse.syncToSynapse
function I am splitting fields by the comma delimited list wrapped in brackets to allow for multiple annotation values to be applied to an individual annotation key.Testing:
This manifest TSV generated the screenshot below: