Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify the data submission form for LCMethod #759

Merged
merged 6 commits into from
Nov 30, 2023
Merged

Conversation

hepcat72
Copy link
Collaborator

@hepcat72 hepcat72 commented Sep 27, 2023

Summary Change Description

The submission form was edited in google docs to request LCMethod details and the documented instructions were changed to reflect that.

I also:

Affected Issues/Pull Requests

Review Notes

Note, I edited the submission form. Please take a look at that.

  • I edited the link in section 1 from tracebase-dev to tracebase
  • I changed the LC Method(s) question to read:

    What liquid chromatography method and run length did you use for each file (e.g. polar-HILIC-25-min, polar-reversed-phase-ion-pairing-25-min, polar-reversed-phase-25-min, lipid-reversed-phase-25-min)?
    If none of these, please include a description (e.g. lipid-reversed-phase-25-min This method uses C18 columns for the analysis of lipids).
    I tried to indicate that the user should use the supplied example values, with run-length, and that if their method differs, they should also include a description.

  • I changed the answer from short answer to long answer (paragraph)

Personally, I feel like this data should be in a sheet in the uploaded excel file so that each method can be associated with the appropriate sample/file, but I tried to stay true to the issue description.

Checklist

This pull request will be merged once the following requirements are met. The
author and/or reviewers should uncheck any unmet requirements:

Files checked in: TraceBaseDocs/TraceBaseDocs/Upload/How to Upload.md TraceBaseDocs/TraceBaseDocs/Upload/How to label Mass Spec Run Information.md
Comment on lines 46 to 47
3. Mass Spectrometry Details (see [How to label Mass Spec Run
Information](https://docs.google.com/document/d/1Lm4br-jCB2QwbgPyzJvvgLgaO-t7Cwcq1xHyBIOlTog/edit?usp=sharing))
3. Mass Spectrometry Details (see [[How to label Mass Spec Run
Information]])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C. Removed the google docs link and added a new markdown file to paste its contents in. I have not made any changes to the documentation website.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that this internal "link" will work. Since I have not deployed this to the docs website, I don't know how to test that it works.

…ssion form.

Files checked in: TraceBaseDocs/TraceBaseDocs/Upload/How to label Mass Spec Run Information.md
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C. This content was copied from the google doc and then edited.

Comment on lines 18 to 29
- Description of Liquid Chromatography method and run length used
- Possible values:
- unknown
- polar-HILIC-25-min
- polar-reversed-phase-ion-pairing-25-min
- polar-reversed-phase-25-min
- lipid-reversed-phase-25-min
- If the method/run-length is not available, please also include a description
- Example Description: lipid-reversed-phase-25-min
This method involves the analysis of lipids using reversed-phase
chromatography on C18 columns, coupled with high resolution mass
spectrometry. It is mainly used for the analysis of lipids.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C. This is the section that was edited

Comment on lines +1 to +2
# How to label Mass Spec Run Information

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C. Added a title.

@hepcat72 hepcat72 requested a review from a team September 27, 2023 20:34
@hepcat72 hepcat72 changed the title Modified instructions for the modified submission form. Modify the data submission form for LCMethod Sep 27, 2023
@hepcat72 hepcat72 added the status:inreview This issue/PR is awaiting completion of reviews. label Sep 27, 2023
Copy link
Contributor

@lparsons lparsons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching and fixing the link to tracebase-dev. Also, I agree that this should probably be a new sheet added to the template. The issue #705 states "Update sumbission templates..." so I don't see any conflict in doing that, but please let me know if you have questions or concerns.

@hepcat72
Copy link
Collaborator Author

I agree that this should probably be a new sheet added to the template. The issue #705 states "Update sumbission templates..." so I don't see any conflict in doing that, but please let me know if you have questions or concerns.

The ambiguous/confusing part to me about this is, I wasn't sure it the animal/sample template was included in "submission templates" or if the qualifier "submission" was singling out the sheet associated with the submission form by name and that it was plural because it was "allowing" a separate submission template to be added (separate from the animal/sample sheet) for the newly required information, given the M:M association between the sample and LCMethod / MSRun. I can see you could make a case that that is over-complicating the issue, but given I was struggling with reconciling this question with other issues/concerns, it just wasn't clear to me. So thanks for providing your interpretation. It sounds like you're suggesting that the design here is given a wide berth of interpretations, so I will come up with what I think is the most appropriate solution and possibly revise this and request a followup review.

@hepcat72 hepcat72 marked this pull request as draft September 28, 2023 15:44
@lparsons
Copy link
Contributor

Thanks Rob. I suggest that run your design past the group before committing significant time to it. As we've reviewed before, providing a means for researchers to provide a single lcmethod for each accucor file should cover the majority of cases. When we write the loading code, we can allow for finer grained control (see #706). However, it is best to keep things as simple and straightforward for researchers.

@hepcat72
Copy link
Collaborator Author

Thanks Rob. I suggest that run your design past the group before committing significant time to it. As we've reviewed before, providing a means for researchers to provide a single lcmethod for each accucor file should cover the majority of cases. When we write the loading code, we can allow for finer grained control (see #706). However, it is best to keep things as simple and straightforward for researchers.

I am a step ahead of you on that. I created a design proposal yesterday before I left, for issue #706. There is a lot of overlap between this issue and that one, if you implement the changes called for here in the animal/sample table, since it's the loading code of the accucor data loader that takes in that data. And that's often how I approach design for challenges like this: from the perspective of the code. It informs what I need in terms of the data structure required. Since I was in the mindset of the submission form with this issue, and it didn't seem to fit with the data relationship between LC Method and the submission form, I was having trouble reconciling what seemed like contradictory requirements. Looking at it from the perspective of the loading code helped clarify it for me.

The design I created in issue #706 relates to the changes we discussed yesterday in #687, where I liked the suggestion you made to add an optional sample data header column. That design proposal suggests adding that value as a "name" field to an MSRun record (or equivalent ms-related table - I have to review the schema design to get familiar with the model names you assigned). What that design doesn't explicitly say is that instead of adding that sample data header column to the animal/sample table, I instead propose supplying a separate LCMS metadata file to the accucor loader that associates sample names to sample data headers, along with the other metadata delineated in the issue description (AccuCor file, mzXML filename, Researcher, Date, Instrument, lc method).

This is my proposed alternative to re-supplying the animal/sample table to the accucor data loader. And my reason for that relates to a comment I made that points out that additional data submissions, if we re-supply the animal/sample table, would require either modifying the original sample table associated with a previous submission or duplicate the samples in new submission, neither option of which seemed desirable to me. This alternative design, adding an LCMS metadata file separates the dependency, keeps the animal/sample table clean and uncluttered, decouples the animal/sample table loading code from the accucor code, and meets the data relationship problems without duplicating data or modifying old data.

@hepcat72 hepcat72 added status:onhold This issue is on hold, pending consensus. and removed status:inreview This issue/PR is awaiting completion of reviews. labels Oct 2, 2023
@hepcat72 hepcat72 marked this pull request as ready for review November 6, 2023 14:53
@hepcat72 hepcat72 added status:inreview This issue/PR is awaiting completion of reviews. and removed status:onhold This issue is on hold, pending consensus. labels Nov 20, 2023
@hepcat72
Copy link
Collaborator Author

@lparsons - The dependencies have been merged to main and I updated this branch. Do you want to review it? Your last review was a comment, so I'll wait for the approval before merging.

Copy link
Contributor

@lparsons lparsons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thanks for moving the docs from Google Docs to the markdown files. I'm not sure if you modified the form or not, but it seems like we should probably update the form to take a single value for date, operator, and lc_method and indicate that if multiple are used, the researcher should fill in the details in a separate file. Am I correct in assuming that is how you were planning to have people submit more complicated situations? If so, let's go ahead and merge and then update the form and provide a template.

@hepcat72
Copy link
Collaborator Author

I guess the review was a tad premature. I should have read my affected issues section. I made some changes to the docs based on the changes in #774. Take a look.

I also updated the submission form and I added a template for the LCMS-metadata file.

If you see anything there, let me know.

Files checked in: TraceBaseDocs/TraceBaseDocs/Upload/How to Upload.md TraceBaseDocs/TraceBaseDocs/Upload/How to label Mass Spec Run Information.md TraceBaseDocs/TraceBaseDocs/Upload/Sample Information Sheet.md
Copy link
Contributor

@lparsons lparsons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the lc methods spelled out in the form is great. I can't see the LCMS-metadata file template, the link seems broken: "Sorry, unable to open the file at this time. Please check the address and try again.". A few minor comments, but otherwise, this looks good to go now.

TraceBaseDocs/TraceBaseDocs/Upload/How to Upload.md Outdated Show resolved Hide resolved
Co-authored-by: Lance Parsons <lparsons@princeton.edu>
Files checked in: TraceBaseDocs/TraceBaseDocs/Overview/Navigating TraceBase.md TraceBaseDocs/TraceBaseDocs/Types of Data Output/Fcirc.md TraceBaseDocs/TraceBaseDocs/Types of Data Output/How to Download.md TraceBaseDocs/TraceBaseDocs/Upload/Labeling and Organizing Data.md TraceBaseDocs/TraceBaseDocs/Upload/Sample Information Sheet.md
@hepcat72
Copy link
Collaborator Author

I changed the access of the metadata template to public (like the animal sample template). You should be able to access it now if you're not logged into your google account - I assume that's how you tested it, since it was restricted, but you have edit access.

@hepcat72 hepcat72 merged commit c945545 into main Nov 30, 2023
19 checks passed
@hepcat72 hepcat72 deleted the lcmethod_submission branch November 30, 2023 20:41
@hepcat72
Copy link
Collaborator Author

I went ahead and merged. If you'd like any changes relative to the added template, I'll make a separate PR.

@lparsons
Copy link
Contributor

I changed the access of the metadata template to public (like the animal sample template). You should be able to access it now if you're not logged into your google account - I assume that's how you tested it, since it was restricted, but you have edit access.

I can use the link now. I was logged in before, but perhaps there was an issue since you provided a "copy" link? The template looks clean enough, though a bit intimidating to fill out. Hopefully most researchers won't have to, and we can always tweak things as we go. Thanks for synchronizing the docs, etc.

@hepcat72
Copy link
Collaborator Author

I was thinking before that perhaps having a separate page in the docs that includes an example would help? I'd have added that yesterday, but was eager to move on... I know that you'd found it confusing in the design and asked for an example, which I believe seemed to help...

@lparsons
Copy link
Contributor

Examples can be really useful, but I'd like to keep the instructions as streamlined as simple as possible. Right now we have two major issues to contend with:

  1. TraceBase doesn't strore the mzXML files
  2. Submission of new data is too cumbersome

@hepcat72
Copy link
Collaborator Author

I started working on #710 this afternoon. I've been selecting issues to work on based on the issue-tracking issues (#693, #700, and #701). I don't actually see an issue for storing the mzXML files. Am I just overlooking it? If it doesn't exist, I can create an issue for it, if you like.

Also, I have a pretty well-thought-out plan for streamlining the submission process. I outlined it on pages 2 & 3 of my original proposal. Its groundwork is laid out in:

But I haven't created issues for the rest of it. Essentially, all a user would have to do is submit accucor/isocorr files and once errors are addressed, it would spit out an excel file with the usual animals/samples/treatments/tissues tabs plus tabs for study, infusates, tracers, and the LCMS metadata (which in the proposal, is referred to in 2 sheets: corrections and mzxml sheets) with much of the tedious work done automatically.

They wouldn't have to be new tabs in an excel file either. It could just output tsv files (e.g. compounds, lcms metadata, etc.) like we currently use, though a new tracer/infusate tab would be nice.

But the major benefit would be that it can stub-out all these files with the easy things and the user can just manually fill in the rest (i.e. data that cannot be unambiguously computed).

The one limiting factor is speed. E.g. They couldn't submit an entire study's accucor files. But it could work per handful of accucor files and accumulate the results

@hepcat72
Copy link
Collaborator Author

Ah, I finally found the issue. #753 lays out my submission process proposal. Searching did not reveal it before.

@hepcat72 hepcat72 removed the status:inreview This issue/PR is awaiting completion of reviews. label Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update submission templates to ask for LCMethod information
2 participants