Metadata intake#87
Conversation
|
Hi @lucasortizny! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
1 similar comment
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
| ), | ||
| **kwargs, | ||
| ) | ||
| # Convert config into a dictionary (but as it is, excludes "common") |
There was a problem hiding this comment.
Is that right? The way ConfigParser works, things that are in common are mirrored to all other sections. What this method is doing (I think) is omitting these mirrored copies (basically if something is in common, and also shows up in another section, just ignore the one in the other section since it's a mirror).
There was a problem hiding this comment.
Oh I absolutely did not know that. I knew default sections in ConfigParsers were special but now it makes sense why this line is here!
There was a problem hiding this comment.
Ok -- can you update the comment to be correct or remove?
There was a problem hiding this comment.
Yeah I'll update it, haven't pushed any commits yet
| return None | ||
|
|
||
| def record_setup(self, description, name, id=None, request=None) -> str: | ||
| def record_setup(self, description, name, extrametadata=None, id=None, request=None, participant_id=None) -> str: |
There was a problem hiding this comment.
can you underscore extra_metadata?
| ### if the metadata does not exist, we are going to log nothing | ||
| else: | ||
| self._db_master_record = self.db.record_setup( | ||
| description="default description", |
There was a problem hiding this comment.
shouldn't this be experiment_name and experiment_description instead of these hardcoded defaults?
There was a problem hiding this comment.
Which line(s) in particular? @mshvartsman is it the else condition where string values are given instead?
There was a problem hiding this comment.
Yeah I'm not sure what I was thinking. If there's no metadata we can keep it to some default string.
| id=experiment_id, | ||
| ) | ||
| ### make a temporary config object to derive parameters because server handles config after table | ||
| tempconfig = Config(**request["message"]) |
There was a problem hiding this comment.
Why not modify self.configure to take in a Config object instead of the raw message? Then you don't need to create it twice (the first thing self.configure does is create a Config anyway).
There was a problem hiding this comment.
Hey @mshvartsman ! I absolutely agree in that it would be a great idea to just pass the config object. One thing I am concerned about is that self.configure is also called by the legacy handle_setup function with a string. I would have to write another version of this function to adjust to the new handle_setup_v01 but again it is important to think about whether you'd like to see another secondary version of an existing function. Please let me know if you'd like me to proceed. (TLDR: I don't want to modify the original handle_setup in any way unless absolutely necessary).
There was a problem hiding this comment.
I think modifying the original handle_setup is maybe okay here. To the extent that there's a contract here it's that legacy handle_setup's behavior doesn't change, which will still be true if you change where the string-to-config conversion happens. I don't think anything else is calling self.configure. @crasanders what do you think?
There was a problem hiding this comment.
Yeah, I think it's fine to change handle_setup to pass a Config into self.configure.
There was a problem hiding this comment.
Okie dokie, will do!
|
|
||
| self._db_master_record = self.db.record_setup( | ||
| description="default description", |
There was a problem hiding this comment.
I don't think these are particularly necessary capitalization / whitespace changes
There was a problem hiding this comment.
I don't have an opinion one way or the other about capitalization, but do make sure that it's consistent. It might be useful to define DEFAULT_NAME and DEFAULT_DESC constants.
|
Great start -- left some nits in the code. Can you put together a minimal example that constructs some messages with metadata, processes them through Also, once #89 lands you should run |
| experiment_name = Column(String(256)) | ||
| experiment_description = Column(String(2048)) | ||
| experiment_id = Column(String(10), unique=True) | ||
| participant_id = Column(String(50), unique=True, nullable=True) |
There was a problem hiding this comment.
Do we want participant_id to be nullable?
There was a problem hiding this comment.
I originally wanted it that way but then I eventually wrote it so a UUID would be generated anyways so this line no longer makes sense so I will remove the nullability!
| id=experiment_id, | ||
| ) | ||
| ### make a temporary config object to derive parameters because server handles config after table | ||
| tempconfig = Config(**request["message"]) |
There was a problem hiding this comment.
Yeah, I think it's fine to change handle_setup to pass a Config into self.configure.
|
|
||
| self._db_master_record = self.db.record_setup( | ||
| description="default description", |
There was a problem hiding this comment.
I don't have an opinion one way or the other about capitalization, but do make sure that it's consistent. It might be useful to define DEFAULT_NAME and DEFAULT_DESC constants.
… remove nullability, modified both handle_setup's for new configure command, changed comment to take into account actual behavior
…sych into metadata-intake
…sych into metadata-intake
This reverts commit 3e6615f.
|
The reason why config was changed to usedconfig is because the metadata-intake functionality introduces a "config" parameter in the configure function in line 677 that I don't want to run into variable naming issues. |
|
@crasanders has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Summary: Here is my pull request for the metadata intake. Some notes to recap: - The implementation in server.py is in handle_setup_v01. As requested, handle_setup has been left intact. - The master table has two new columns: participant_id (unique, nullable, max size is 50 characters in length, auto-generated UUID4 if not specified) and extra_metadata (4096 characters in length). - The extra metadata information will be serialized/deserialized -> JSON and placed into the Database in that way for future reference. - The Database schematic I originally created will need to be updated with the new information. Please let me know if there are any questions! This should help cover facebookresearch#32 Pull Request resolved: facebookresearch#87 Differential Revision: D38840357 Pulled By: crasanders fbshipit-source-id: 3549bdf887df4f09829cc9088d3a303b29932dfa
Summary: Here is my pull request for the metadata intake. Some notes to recap: - The implementation in server.py is in handle_setup_v01. As requested, handle_setup has been left intact. - The master table has two new columns: participant_id (unique, nullable, max size is 50 characters in length, auto-generated UUID4 if not specified) and extra_metadata (4096 characters in length). - The extra metadata information will be serialized/deserialized -> JSON and placed into the Database in that way for future reference. - The Database schematic I originally created will need to be updated with the new information. Please let me know if there are any questions! This should help cover facebookresearch#32 Pull Request resolved: facebookresearch#87 Differential Revision: D38840357 Pulled By: crasanders fbshipit-source-id: e5e6ce31bca2e959349204afaec99e5d6e6270b1
Summary: Here is my pull request for the metadata intake. Some notes to recap: - The implementation in server.py is in handle_setup_v01. As requested, handle_setup has been left intact. - The master table has two new columns: participant_id (unique, nullable, max size is 50 characters in length, auto-generated UUID4 if not specified) and extra_metadata (4096 characters in length). - The extra metadata information will be serialized/deserialized -> JSON and placed into the Database in that way for future reference. - The Database schematic I originally created will need to be updated with the new information. Please let me know if there are any questions! This should help cover facebookresearch#32 Pull Request resolved: facebookresearch#87 Differential Revision: D38840357 Pulled By: crasanders fbshipit-source-id: 6f07bd944e4184f5e951383f28ded5fff4850e42
Summary: Here is my pull request for the metadata intake. Some notes to recap: - The implementation in server.py is in handle_setup_v01. As requested, handle_setup has been left intact. - The master table has two new columns: participant_id (unique, nullable, max size is 50 characters in length, auto-generated UUID4 if not specified) and extra_metadata (4096 characters in length). - The extra metadata information will be serialized/deserialized -> JSON and placed into the Database in that way for future reference. - The Database schematic I originally created will need to be updated with the new information. Please let me know if there are any questions! This should help cover facebookresearch#32 Pull Request resolved: facebookresearch#87 Test Plan: New unit tests Differential Revision: https://internalfb.com/D38840357 Pulled By: crasanders fbshipit-source-id: 328b4cbc6ccd03a1e9869c70dbf18f875846327d

Here is my pull request for the metadata intake. Some notes to recap:
Please let me know if there are any questions! This should help cover #32