Fitting KDs with Data Collector

michaelmarty edited this page Dec 22, 2020 · 7 revisions

The Data Collector module was a precursor to MetaUniDec (which now has many of the same functions), but it is still useful for fitting KDs from native MS data and extracting a variety of other data. Here's how you do it:

  1. Deconvolve Spectra With UniDec: I recommend converting all of your data into text files first and saving the text files in the same folder. Then, deconvolve each of these files using UniDec. Make sure you are happy with the deconvolutions. You can use batch processing to quickly work through the files.

  2. Add Files to Data Collector: Open the Data Collector module from the launcher and set the directory of your text files in the box at the top (click the "..." button). Then, click add files and select the files you want to add. You will need to select the top-level data text files, not anything inside the <file_name>_unidecfiles folder. The Data Collector will automatically grab the files it needs from that folder. I recommend clicking and adding the files in order of increasing ligand concentration to avoid plotting issues.

  3. Set the Concentrations: First, set the protein and ligand concentration for each spectrum. You can specify which charge state you want, but I would highly recommend using all charge states.

  4. Set the Masses: Next, set which mass values correspond to which ligand bound states. Click "Add X Value", set the mass of each peak, and set the number of proteins and ligands that the peak corresponds to.

  5. Set the Extraction Parameters: This tells Data Collector how you want to extract the data and what to extract from. I recommend setting it to "Zero Charge Mass Spectrum" to get the deconvolved data. You can also extract from the raw or processed data, but you will need to adjust your X values in the panel to reflect m/z values rather than masses. You can limit the mass range that is imported to make it easier to plot. For KD fitting, you will want to select normalize data and normalize extraction. At this point, you can click "Run Extraction" and see how it looks.

  6. Adjust the Extraction Type and Window: The default is to extract the peak height at the exact masses that you entered. You will likely want to make it more robust by looking at a local max within a window. Change "Height" to "Local Max" and set a window, which defines the range (+/- the entered value) in which it will look for the local max. That said, I would ultimately recommend using "Area" unless the background is really high. Here, the window defines the distance +/- the entered value to use as the integration range.

  7. Fit KDs: Make sure the total number of proteins and total number of ligands are set correctly. Leave Number of Bootstraps at 0 for now. Hit "Fit KDs" and see how it looks.

  8. Adjust the Models: There are several different KD models that you can select. For example, you can fix the same KD across all ligand binding events or let it be free. It can fit both ligand binding and protein oligomerization simultaneously. If you only have one protein, it won't matter what you set the protein model to be. If you set the maximum number of binding sites, it will determine microscopic KDs. Otherwise, it will use macroscopic KDs. Basically, microscopic KDs are corrected for the total number of available binding sites. A more extended discussion is beyond the scope of this Wiki, but look it up because it is interesting. If you are looking at a single protein complex that isn't changing its oligomeric state, set the number of proteins to 1 and adjust the number of binding sites. For example, with a trimer, set the number of proteins to 1, the concentration to the concentration of the trimer, and the number of binding sites to 3.

  9. Get Error Bars: After you are happy with the KD model and fits, gradually increase the number of bootstraps to determine the error of the fits. I would recommend starting with 10 before increasing to 100 and then 1000. This will slow things down, but it will provide the errors of the fits. Sometimes, you will get crazy outliers among the bootstraps that need to be removed by selecting "Remove Outliers".

  10. Special Note: With version 4.1, you can fit protein-protein oligomerization KD values in the absence of ligand. The one trick is that you need to set both the protein and ligand concentration to be the protein concentration (future versions will fix this). Then, simply set the x-values in the list to be n proteins and 0 ligands and set the max ligand number to 0. It should fit everything as protein oligomerization.
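To make the three extraction types from step 6 more concrete, here is a minimal Python sketch of what "Height", "Local Max", and "Area" could look like on a zero-charge mass spectrum, followed by a normalization so the extracted values sum to 1. This is only an illustration and not the actual Data Collector code: the function names (`extract_peak`, `extract_states`), the mode strings, and the synthetic two-peak spectrum are all made up for the example, and it assumes the mass axis is sorted in increasing order.

```python
import math

def extract_peak(masses, intensities, center, window=0.0, mode="height"):
    """Return one value for the peak entered at `center` (hypothetical helper)."""
    if mode == "height":
        # Intensity at the data point closest to the entered mass.
        nearest = min(range(len(masses)), key=lambda i: abs(masses[i] - center))
        return intensities[nearest]
    # The other two modes only look at points within center +/- window.
    points = [(m, y) for m, y in zip(masses, intensities)
              if center - window <= m <= center + window]
    if not points:
        return 0.0
    if mode == "localmax":
        # Highest intensity found inside the window.
        return max(y for _, y in points)
    if mode == "area":
        # Trapezoid-rule integral over the window (assumes sorted masses).
        return sum(0.5 * (y0 + y1) * (m1 - m0)
                   for (m0, y0), (m1, y1) in zip(points, points[1:]))
    raise ValueError("unknown mode: " + mode)

def extract_states(masses, intensities, centers, window, mode):
    """Extract every entered mass, then normalize so the values sum to 1."""
    values = [extract_peak(masses, intensities, c, window, mode) for c in centers]
    total = sum(values)
    return [v / total for v in values] if total > 0 else values

def gauss(m, mu, sigma, height):
    return height * math.exp(-0.5 * ((m - mu) / sigma) ** 2)

# Synthetic spectrum (made-up numbers): apo peak at 10000 Da and a
# one-ligand peak at 10700 Da with half the height.
masses = [9500.0 + 5.0 * i for i in range(341)]
intensities = [gauss(m, 10000.0, 20.0, 1.0) + gauss(m, 10700.0, 20.0, 0.5)
               for m in masses]

for mode in ("height", "localmax", "area"):
    print(mode, extract_states(masses, intensities, [10000.0, 10700.0], 50.0, mode))
```

With clean, well-centered peaks like these, all three modes give similar fractions. The differences show up when the entered mass is slightly off (where "Height" suffers) or when the background is high (where "Area" picks up extra signal).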

Hope that helps! It is actually a pretty powerful and general KD fitting module that can be fully scaled to n protein homo-oligomerization steps and m ligand binding steps. Feel free to look more at the UniFit.py source code to see how it works. The source code also allows stranger things like ligand oligomerization, but you need to script this to make it work. As always, let me know if you have any questions.
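For anyone curious about the microscopic vs. macroscopic distinction in step 8, here is a small Python sketch of the textbook relationship for n equivalent, independent binding sites, plus a mass-balance solve for the free ligand concentration. To be clear, this is not taken from UniFit.py and may differ from how the module actually does it. The function names, the bisection solver, and the example numbers (a trimer with 3 sites, a 10 µM microscopic KD, 5 µM protein, 20 µM ligand) are all assumptions for illustration.

```python
def macroscopic_kds(k_micro, n_sites):
    """Stepwise macroscopic KDs for n equivalent, independent sites."""
    # For step i there are (n - i + 1) empty sites a ligand can bind to
    # and i bound ligands that can leave, giving the statistical factor.
    return [k_micro * i / (n_sites - i + 1) for i in range(1, n_sites + 1)]

def state_fractions(l_free, kds):
    """Fractions of protein with 0..n ligands bound at a free ligand concentration."""
    weights = [1.0]
    for kd in kds:
        # [PL_i] / [PL_(i-1)] = [L_free] / KD_i
        weights.append(weights[-1] * l_free / kd)
    total = sum(weights)
    return [w / total for w in weights]

def solve_free_ligand(p_total, l_total, kds, tol=1e-12):
    """Bisection for the free ligand such that free + bound = total ligand."""
    lo, hi = 0.0, l_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        fractions = state_fractions(mid, kds)
        bound = p_total * sum(i * f for i, f in enumerate(fractions))
        if mid + bound > l_total:
            hi = mid  # too much ligand accounted for, so lower the guess
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Example with made-up numbers (all concentrations in the same units, e.g. uM).
kds = macroscopic_kds(10.0, 3)
l_free = solve_free_ligand(5.0, 20.0, kds)
print("macroscopic KDs:", kds)
print("free ligand:", l_free)
print("fractions with 0-3 ligands bound:", state_fractions(l_free, kds))
```

The fractions returned by `state_fractions` are the kind of quantity you would compare against the normalized extracted intensities. A fit would then adjust the KDs to minimize the difference across all of the ligand concentrations.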