Skip to content

Experimental designs for Varying One Factor at a Time (OFAT) Crystallization Expereiments

License

Notifications You must be signed in to change notification settings

MooersLab/ofat4xtals

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

77 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Varying One-Factor-at-a-Time (OFAT) Experimental Designs for Crystal Growth

Version License: MIT

Purpose

This repository contains Excel spreadsheets for applying OFAT Designs (OFATs) to the size optimization of crystals of biological macromolecules (proteins and nucleic acids). See the list below of features. These include several automated steps that ease the use of these spreadsheets and accelerate the analysis of results.

These designs are developed to be deployed in 4x6 24-well crystallization plates. These spreadsheets can be adapted to other biochemical experiments as explained below. They can also be adapted to stochastic computer experiments.

Crystal size is generally proportional to its diffraction power, provided the crystal is not internally disordered and properly cryoprotected. Crystallographers seek large crystals to obtain high-resolution data for high-quality structures.

The growth of large protein crystals is highly reproducible when near the optimal conditions. You know that you have found the optimal conditions when replica drops return a single large crystal per drop. With these optimal conditions, you can generate hundreds of large crystals to find favorable cryo conditions and for ligand and heavy atom soaks. One of the strengths of protein crystallography as an experimental system is the highly reproducible nature of crystallization results.

Workers who report difficulty reproducing crystals likely found the crystals on the steep slope of the response surface below the peak of optimal conditions. On the steep slope of the response surface, the factors very rapidly and are difficult to reproduce. In contrast, the gradients in the factors are close to zero at the peak which explains why the peak conditions are reproducible.

Features to ease the use of the design

Autogenerated plots of the results against the factor levels accelerate the identification of the active factors. The user enters the length of the longest crystal per drop (if any crystals are present) and the crystal score, and then the plots automatically update. Placeholder values in the column represent common scenarios.

  • Treatments for one factor are listed in a column rather than across columns to ease the analysis of the results
  • Support for randomization of the treatments within a factor.
  • User input is specified in the light gray-shaded cells.
  • Automated updating of all the required volumes.
  • The difference between the expected final volume and the sum of the non-water components to check to ensure that it is greater than 0. If the sum is negative, the cell will turn red. The total volume of the aliquots for one crystallization solution is checked to ensure that it sums to the target volume. The cell will turn red if the sum does not equal the target value.
  • The user is free to specify the total volume, this is not hardwired.
  • The total volume of aliquots for each stock solution is summed across all 24 wells so that the user knows the volume of stock solution they will need for each required stock solution. They can compare these volumes to the available stocks to determine if they need to make new solutions. They should use only the latest solutions rather than stock solutions prepared at different times because this could be a source of experimental variability. After randomization, the table of the treatments assigned to the crystallization tray can also be used to record observations of the crystallization drops.
  • A crystal scorecard is included for convenience.

What are OFAT experiments?

These OFAT experiments involve holding all factors constant except for one while it is varied over several factor levels.

Advantages of OFAT experiments

  • Requires less mental effort is required to interpret the results
  • Not guaranteed to find the optimum via use in serial experiments to search for the optimum.

Disadvantage of OFAT experiments

  • Cannot detect factors that interact.
  • Require more samples for the same level of precision compared to more advanced experimental designs.
  • Experimental error must be smaller than the factor effects.

We recommend starting the variation at 0 to test whether a factor's presence or absence influences crystal growth. Here, we test four levels of each factor, including the zero factor level, so that the three points with the factor present can be used to assess whether the response is linear or curvilinear.

Most factors are expected to have a concave curvilinear effect on the crystallization of biological molecules. Convex curvilinear effects may occur occasionally. The effect might appear linear if the range of the tested factor levels is narrow. In this latter case, you may be exploring only the rising or falling shoulder of a quadratic response.

Four levels is a minimal approach to detecting the curvilinear response. Five or six levels might be a more robust approach to characterizing the curvilinear nature of the response. However, the more significant number of factor levels rapidly increases the size and expense of the experiment.

This is about all you can expect out of these kinds of experiments. You may propagate your lead when testing factors that do not influence crystallization. This might be a side benefit if the crystallization conditions are close to the optimum.

These OFAT experiments are a horrible way to find the optimal conditions when serially applied. They should be limited to exploring a factor's nature regarding whether its presence has an effect and detecting a curvilinear response. This information can then be used to select the factors and their levels in full factorial, incomplete factorial, or optimal experimental designs. These latter designs can detect at least two-way interactions in addition to quadratic responses.

Replication and randomization

The typical practice is to avoid creating replica crystallization experiments to save time and expensive samples. That is, the standard practice is to do exploratory trials rather than statistically valid experiments that return an assessment of the variation.

Ordered:

HTML5 Icon

Randomized:

HTML5 Icon

The treatments within a factor should be randomly assigned to break correlations caused by systematic errors. Each of those experiments is treated as an independent experiment, so there is no randomization across the experiments. This latter feature eases the setting up of the crystallization solutions within an OFAT experiment because only the volume of one factor and the volume of the water vary.

The protocol for re-randomization of the treatments in an experiment with no replica is as follows:

  • Select cells C6 to P11
  • Go to sort pulldown
  • Select
  • Sort on column.
  • Repeat by selecting C11 to P14.
  • Custom sort on column C.
  • Repeat by s -rw-r--r-- 1 blaine staff 1114 Jan 15 07:15 LICENSEelecting C15 to P18. Custom sort column C and so on.

To help interpret the results, see the autogenerated plots in the worksheet. Do NOT undo the randomization.

Liquid handling robot

While you might be tempted to feed these designs into a liquid-handling robot, you should look into the robots' coefficients of variation (CV). It may be much higher than you expected. One popular liquid-handling robot has a CV of 5%; I was expecting a CV of 1%. Your manual pipetting CV may be less than 2% if you minimize evaporative losses while dispensing the stock solution. In addition, it can take about 12-20 hours of tedious editing of an input CSV file to read one of these designs by the liquid handling robot.

Using a liquid handling robot once the csv file is in working order and can be used repeatedly after minor modification for each experimental design might yield some efficiencies. However, for small volumes, one has to worry about evaporative losses during the assembly of the solutions. These losses may be the primary sources of variation in the final volumes rather than pipetting errors.

Customizing the design for your experiment

It takes about 3 minutes to edit a spreadsheet to customize it for a new crystallization experiment. The cells shaded light gray require editing.

We were setting up 100 microliter volumes using the mini-screen approach, which is popular in nucleic acid crystallography. In this approach, the crystallization solution is mixed with the sample stock solution but not put in the reservoir, unlike the standard practice in protein crystallography. This approach minimizes using expensive and toxic chemicals in the crystallization solution. This is a greener approach to crystallization experiments.

This approach also allows you to reuse the crystallization tray because you are not withdrawing components from the reservoir solution that may not be RNase-free despite your efforts to clean the reservoir between uses. The recycling of crystallization plates saves money and reduces plastic waste.

If you make the reservoir solutions according to the standard protein crystallization practice, you can change the volume to 1,000 microliters. You can use 350 microliters, which is sufficient to cover the bottom of a standard 24-well crystallization plate. The surface area of the reservoir solution matters more than its volume. If you want to make stock solutions for 15 mL or 50 mL Falcon tubes, you could change the volume from 100 microliters to 15,000 or 50,000 microliters.

HTML5 Icon

The spreadsheet makes it easy to apply OFATs in laboratory experiments. These spreadsheets could also be adapted to other laboratory and field experiments. If you need a design with a run configuration different from a 4 x 6 array, please post an issue.

Codings and treatments

A design matrix with the codings of the factor levels is applied to the user-entered concentration ranges. This matrix has five continuous factors and one categorical factor. In this case, the categorical factor is the buffer's identity. The buffer identity has to be entered by hand in column AR.

The reservoir's crystallization and precipitant solution are independent in a mini-screen setup. You can set up a factor that is the concentration of the precipitating agent in the reservoir solution. This is often an active factor with a quadratic effect on crystal size. This factor should always be varied. In the case shown here, we can make a second copy of the worksheet and vary the precipitant concentration in column 1 of a second tray.

What if you have fewer than six chemical factors?

If the solution for your crystallization lead has fewer than six chemical components and you use the mini-screen approach, you can use one of the unneeded columns to vary the concentration of the precipitating agent in the reservoir. You can also use the unused columns for testing physical parameters like drop size and sample volume ratio to crystallization solution volume.

you can also use an unused column for the purpose of repeating one of the factors. this partial replication will provide some measure of the variability in your experiment. it would be better to do replica experiments. you can set up a replica tray with a different randomization of the treatments within each column.

A more rigorous approach would be to have the replica columns adjacent to each other and randomize the treatments within this column and row block. We are working on spreadsheets that will support the presence of one, two, or three replicas, with the replica columns grouped together.

HTML5 Icon

Autogenerated Recipes

The recipes for each crystallization solution are automatically generated for continuously varying factors based on the user-supplied input. The recipes for categorical factors, like buffer identity, must be entered by hand. The column on the left below is for a continuous factor. The column on the right is for a categorical factor.

HTML5 Icon

Pairs of columns can be selected in Excel and fit to the page for printing. One 6 x 4 experiment will require three laboratory notebook pages.

Scoring the results

We score of the longest crystal in the crystallization drop because this is using the crystal that will be mounted in a cryo loop for X-ray diffraction studies. Unless we take extra measures, the remaining crystals in the drop may be lost due to disruption of the crystallization drop and thus be unavailable for diffraction studies. Before crystal's appear long enough to be measured, we can still score the drop with the integers in the scoresheet below. The scores relate to moving towards conditions that give diffraction-quality crystals.

HTML5 Icon

Autogenerated plots

The most common approach to analyzing the crystallization results is to scan the drops to find diffraction-quality crystal(s). This is routinely done to find the leads for follow-up experiments. When such leads do not exist, the most efficient thing to do is more crystallization screens or improve the purification protocol of the sample.

Another option requiring time is to conduct a more rigorous analysis of the crystallization results. In this approach, each of the crystallization drops receives a score, and if crystals larger than 20 micrometers are observed, then the longest crystal in the drop has its length recorded. Otherwise, a value of 0 is entered. These two measures of crystallization success are recorded in a laboratory notebook. Recording the scores and links will take about 10-30 minutes per 24-well plate, depending on the experience of the person doing the scoring.

These scores can then be transferred to the two rightmost columns in the table in the upper left-hand corner of the spreadsheet. After doing so, the user does not have to do any more work because the inserted values are plotted in 12 plots in the spreadsheet's bottom left-hand corner. The user must only look at these plots to identify which factor affects the outcome. The user can also assess the nature of the effect: is it linear and positively correlated with factor level, is it negatively correlated with factor level, or is the response quadratic?

Results that represent routine outcomes have been added to the two columns to generate plots corresponding to everyday scenarios. The user simply replaces these placeholder values with values from their own experiments. These plots will save the user hours of generating their own plots in Excel or with an external program (e.g., the ggplot2 library in R or matplotlib in Python). The table and the plots are linked by code such that only the results from a particular factor are plotted regardless of whether the treatments are ordered in the table or randomized. The user does NOT need to undo the randomization.

Factor has no effect:

The scenario below shows a situation where a factor does not influence the response. This factor can be eliminated from further consideration in future experiments. The Y axis for the length data was auto-scaled, so they accentuate the variation between the lengths when that variation is essentially zero. The Y-axis for the crystal scores did not auto-scale; the scores show a flat line. The plot of the scores provides a reality check on the plot of the crystal lengths.

HTML5 Icon

Response negatively correlated with factor level

This pair of plots demonstrates the scenario where there is a decrease in the response with increasing factor level.

HTML5 Icon

U-shaped response

This pair of plots demonstrates a U-shaped response to the increasing factor level. This kind of response reflects a salting-out and then salting-in effect.

HTML5 Icon

Positively correlated linear response

This pair of plots shows a response that increases linearly with increasing factor levels. The factor levels may be from the rising shoulder of a quadratic response curve.

HTML5 Icon

Quadratic response

This pair of plots shows a quadratic response. The optimal response is near the peak of this curve for this factor. Most factors are expected to get this kind of response.

HTML5 Icon

No lengths to measure

This pair of plots shows responses when the crystals are not long enough to be measured. This comparison shows the value of plotting the crystal scores because the response positively correlates with the rising factor level. These results can lead to the design of a better crystallization experiment.

HTML5 Icon

Printing the design

Invariably, you will want to print and paste the worksheet into a laboratory notebook. The printing of all the components of the spreadsheet requires the setting of several print areas within Excel. Completing this task takes 10 to 20 minutes, so allow sufficient time.

This is the protocol that we use:

  • Select the table or tables that you want to print on a page
  • Under the file pulled down select Print Area/Set Print Area
  • Select print. in the print dialog, select portrait mode and scale to page
  • Click on the blue Print button after selecting the appropriate printer
  • Under the file pulled down, select Print Area/Clear Print Area
  • Repeat the cycle.

The workbook can be printed from other software that can read Excel files (e.g., Libre Office, Open Office, Numbers), following a similar protocol.

Use outside of crystallization experiments

These experimental designs can be adapted to other biochemical experiments involving aqueous solutions where water is the dominant molar component. When water is not the dominant component in a mixture, experimental designs for mixtures must be used.

One could also envision adapting these experimental designs to computer experiments where the results are stochastic.

Update History

Version Changes Date
Version 0.1 Initiated. 2024 April 1
Version 0.2 Working version 2024 April 2
Version 0.3 Version with plots 2024 April 3
Version 0.4 Fixed leftmost column of the design matrix. 2024 April 5

Sources of funding

  • NIH: R01 CA242845
  • NIH: R01 AI088011
  • NIH: P30 CA225520 (PI: R. Mannel)
  • NIH P20GM103640 and P30GM145423 (PI: A. West)

About

Experimental designs for Varying One Factor at a Time (OFAT) Crystallization Expereiments

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published