
OOM issue faced during Searchspace: Required 338 TB Memory #365

Closed

khoonun opened this issue Sep 4, 2024 · 14 comments

Comments

@khoonun

khoonun commented Sep 4, 2024

Hi,

Ideally, I should use a resolution of at least 4. However, even with a resolution of 2, I run into an OOM issue with the code below. According to res1.exp_rep_human_readable, I would need roughly 338 TB of memory for the search space generation.

Essentially, I have 5 types of chemicals to select, and within each type a variety of chemical ingredients to choose from. For instance, for X10 I will use 1 of the 3 possible candidates, for X20 I will use 2 of the 3 possible candidates, and so on.

I understand that Bayesian optimization suffers from combinatorial explosion of the search space, and I have already reduced the search space significantly.

I would appreciate any advice on ways to tackle this issue. I have at most 128 GiB of memory available for this operation.


import numpy as np

from baybe.constraints import (
    DiscreteDependenciesConstraint,
    DiscreteNoLabelDuplicatesConstraint,
    DiscretePermutationInvarianceConstraint,
    DiscreteSumConstraint,
    ThresholdCondition,
)
from baybe.parameters import NumericalDiscreteParameter, SubstanceParameter
from baybe.searchspace import SearchSpace

RESOLUTION1 = 2
SUM_TOLERANCE = 1.0  # placeholder: undefined in the original snippet; set to the allowed slack

X10 = {
    "A": "C",
    "B": "CC",
    "C": "CCC",
}
X_11 = SubstanceParameter(name="X11", data=X10, encoding="MORDRED")

X20 = {
    "D": "CCCC",
    "E": "CCCCC",
    "F": "CCCCCC",
}
X_21 = SubstanceParameter(name="X21", data=X20, encoding="MORDRED")
X_22 = SubstanceParameter(name="X22", data=X20, encoding="MORDRED")

X30 = {
    "G": "CCCCCCC",
    "H": "CCCCCCCC",
    "I": "CCCCCCCCC",
    "J": "CCCCCCCCCC",
    "K": "CCCCCCCCCCC",
    "L": "CCCCCCCCCCCC",
}
X_31 = SubstanceParameter(name="X31", data=X30, encoding="MORDRED")
X_32 = SubstanceParameter(name="X32", data=X30, encoding="MORDRED")
X_33 = SubstanceParameter(name="X33", data=X30, encoding="MORDRED")

X40 = {
    "M": "CCCCCCCCCCCCC",
    "N": "CCCCCCCCCCCCCC",
}
X_41 = SubstanceParameter(name="X41", data=X40, encoding="MORDRED")

X50 = {
    "O": "CCCCCCCCCCCCCC",
    "P": "CCCCCCCCCCCCCCC",
    "Q": "CCCCCCCCCCCCCCCC",
    "R": "CCCCCCCCCCCCCCCCC",
    "S": "CCCCCCCCCCCCCCCCCC",
}
X_51 = SubstanceParameter(name="X51", data=X50, encoding="MORDRED")
X_52 = SubstanceParameter(name="X52", data=X50, encoding="MORDRED")
X_53 = SubstanceParameter(name="X53", data=X50, encoding="MORDRED")
X_54 = SubstanceParameter(name="X54", data=X50, encoding="MORDRED")
X_55 = SubstanceParameter(name="X55", data=X50, encoding="MORDRED")


###################################################################

X11_Equiv = NumericalDiscreteParameter(
    name="X1_1_Equiv", values=list(np.linspace(0, 22.5, RESOLUTION1)), 
    tolerance=0.2
)

X21_Equiv = NumericalDiscreteParameter(
    name="X2_1_Equiv", values=list(np.linspace(0, 5.6, RESOLUTION1)), 
    tolerance=0.2
)
X22_Equiv = NumericalDiscreteParameter(
    name="X2_2_Equiv", values=list(np.linspace(0, 13, RESOLUTION1)), 
    tolerance=0.2
)

X31_Equiv = NumericalDiscreteParameter(
    name="X3_1_Equiv", values=list(np.linspace(0, 3.5, RESOLUTION1)), 
    tolerance=0.2
)
X32_Equiv = NumericalDiscreteParameter(
    name="X3_2_Equiv", values=list(np.linspace(0, 6.5, RESOLUTION1)), 
    tolerance=0.2
)
X33_Equiv = NumericalDiscreteParameter(
    name="X3_3_Equiv", values=list(np.linspace(0, 3.5, RESOLUTION1)), 
    tolerance=0.2
)

X41_Equiv = NumericalDiscreteParameter(
    name="X4_1_Equiv", values=list(np.linspace(0, 5.5, RESOLUTION1)), 
    tolerance=0.2
)

X51_Equiv = NumericalDiscreteParameter(
    name="X5_1_Equiv", values=list(np.linspace(0, 4.5, RESOLUTION1)), 
    tolerance=0.2
)

X52_Equiv = NumericalDiscreteParameter(
    name="X5_2_Equiv", values=list(np.linspace(0, 9, RESOLUTION1)), 
    tolerance=0.2
)

X53_Equiv = NumericalDiscreteParameter(
    name="X5_3_Equiv", values=list(np.linspace(0, 5, RESOLUTION1)), 
    tolerance=0.2
)

X54_Equiv = NumericalDiscreteParameter(
    name="X5_4_Equiv", values=list(np.linspace(0, 9, RESOLUTION1)), 
    tolerance=0.2
)

X55_Equiv = NumericalDiscreteParameter(
    name="X5_5_Equiv", values=list(np.linspace(0, 6, RESOLUTION1)), 
    tolerance=0.2
)

###################################################################
parameters = [
    X_11, X11_Equiv, 

    X_21, X21_Equiv, 
    X_22, X22_Equiv,
    
    X_31, X31_Equiv,
    X_32, X32_Equiv,
    X_33, X33_Equiv,

    X_41, X41_Equiv,

    X_51, X51_Equiv, 
    X_52, X52_Equiv,
    X_53, X53_Equiv,
    X_54, X54_Equiv,
    X_55, X55_Equiv,
]

###################################################################

perm_inv_constraint = DiscretePermutationInvarianceConstraint(
    parameters=[
                "X11", 
                "X21", "X22",
                "X31", "X32","X33", 
                "X41",
                "X51", "X52","X53", "X54", "X55",
               ],
    dependencies=DiscreteDependenciesConstraint(
        parameters=[
                "X1_1_Equiv",   
                "X2_1_Equiv", "X2_2_Equiv",
                "X3_1_Equiv", "X3_2_Equiv","X3_3_Equiv", 
                "X4_1_Equiv", 
                "X5_1_Equiv", "X5_2_Equiv","X5_3_Equiv", "X5_4_Equiv", "X5_5_Equiv",
               ],
        conditions=[
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
            ThresholdCondition(threshold=0.0, operator=">"),
        ],
        affected_parameters=[
            ["X11"], 
            ["X21"], ["X22"],
            ["X31"], ["X32"],["X33"], 
            ["X41"], 
            ["X51"], ["X52"],["X53"], ["X54"], ["X55"],
        ],
    ),
)

###################################################################

sum_constraint = DiscreteSumConstraint(
    parameters=[
                "X1_1_Equiv",   
                "X2_1_Equiv", "X2_2_Equiv",
                "X3_1_Equiv", "X3_2_Equiv","X3_3_Equiv", 
                "X4_1_Equiv", 
                "X5_1_Equiv", "X5_2_Equiv","X5_3_Equiv", "X5_4_Equiv", "X5_5_Equiv",
               ],
    condition=ThresholdCondition(threshold=40, operator="=", tolerance=SUM_TOLERANCE),
)

###################################################################

no_duplicates_constraint = DiscreteNoLabelDuplicatesConstraint(
    parameters=[
                "X11", 
                "X21", "X22",
                "X31", "X32","X33", 
                "X41",
                "X51", "X52","X53", "X54", "X55",
               ],
)

###################################################################

constraints = [perm_inv_constraint, sum_constraint, no_duplicates_constraint]

res1 = SearchSpace.estimate_product_space_size(parameters)
print(res1.exp_rep_human_readable)
@AVHopp
Collaborator

AVHopp commented Sep 4, 2024

Could you share some details on what exactly the Equiv parameters are supposed to do? Since there are 12 of those, the number of values of these parameters explodes quickly as the resolution increases.
Also, can you elaborate on your constraints? I think I have an idea of what you want to achieve with each of them, but to avoid biasing you with my interpretation, it would be better if you first explained what exactly each individual constraint is meant to do.
The reason I am asking is that, depending on what the constraints need to do, you might be able to make the Equiv parameters continuous, avoiding the combinatorial explosion.

@AVHopp
Collaborator

AVHopp commented Sep 4, 2024

Also, I tried to execute your example and found several errors (duplicate names in lists, and so on). Please double-check that your code can be executed directly by copy-pasting it.

@AdrianSosic
Collaborator

Hi @khoonun, I can confirm what @AVHopp wrote. We'd be glad to help, but the minimum input we need from you is a copy-paste-able example that reproduces the problem. Right now, when I copy the code as-is, I get syntax errors.

@Scienfitz
Collaborator

Scienfitz commented Sep 4, 2024

@khoonun Independent of the issues already raised by the others:
The space you seemingly want to build is, unsurprisingly, huge. There are a lot of parameters and choices; you already mentioned combinatorial explosion.

We have a more memory-efficient implementation of the search space creation, including the constraint application. It is automatically enabled if you install baybe with the polars dependency, as in pip install baybe[chem,polars].

But I doubt that this will work for the huge space you are trying to build.

There is no general solution to this problem: if your space is too large, it is too large. So the only options you have, imo, are

  • Reduce the number of parameters and their possible choices
  • Move from a slot-based representation (the one you model here, where slots correspond to chemical substance choices and slot-amounts to the amounts of those slots) to a traditional representation. The traditional representation can be done via continuous parameters, which will make the search space very small (it might still be a heavy computation during optimization, and it doesn't allow for chemical encodings as there are no labels)

I should also mention that the memory number estimated by the utility is an upper limit, because it does not include the effect of constraints. So the memory you actually require might be <1% of the number shown. I know this is unsatisfactory, but it is not trivial to include the effect of constraints in the memory estimation. I make this point because, although 338 TB is shown as your estimate, the real requirement might be substantially lower and a compromise not as far out of reach as this number suggests.
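As a back-of-envelope check (not part of the original thread), the size of the unconstrained product space can be counted directly from the posted parameter definitions, and it lines up with the reported 338 TB once a plausible descriptor width is assumed; the 283-column figure below is reverse-engineered to match the reported estimate, not taken from BayBE:

```python
# Rough count of the unconstrained product space for the setup above.
# Substance slots: X11 has 3 options; X21/X22 have 3 each; X31-X33 have 6
# each; X41 has 2; X51-X55 have 5 each. Each of the 12 Equiv parameters
# contributes RESOLUTION1 = 2 grid values.
substance_combos = 3 * 3**2 * 6**3 * 2 * 5**5   # 36,450,000
equiv_combos = 2**12                            # 4,096
total_rows = substance_combos * equiv_combos
print(f"{total_rows:,} candidate rows before constraints")  # ~1.5e11 rows

# MORDRED encodings are wide; at ~283 float64 descriptor columns per row
# (a hypothetical width chosen to match the reported estimate), the table
# lands at roughly 338 TB.
approx_tb = total_rows * 283 * 8 / 1e12
print(f"~{approx_tb:.0f} TB")
```

The point for the thread: even at resolution 2, the row count alone is about 1.5 × 10¹¹, so the fully enumerated discrete product representation is infeasible regardless of encoding details.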

@khoonun
Author

khoonun commented Sep 5, 2024

> Also, I tried to execute your example and found several errors (duplicate names in lists and so on). Please double-check that your code can directly be executed by just copy-pasting it.

Hey, so sorry! I was trying to edit the parameter names and there were some mistakes, as correctly pointed out by all. I have now corrected them. :)

@khoonun
Author

khoonun commented Sep 5, 2024

> Hi @khoonun, can confirm what @AVHopp wrote. We'd be glad to help but the minimum input we need to expect from you is a copy-paste-able example to reproduce the problem. Right now, when I copy the code as is, I get syntax errors.

> Also, I tried to execute your example and found several errors (duplicate names in lists and so on). Please double-check that your code can directly be executed by just copy-pasting it.

Hey, so sorry! I was trying to edit the parameter names and there were some mistakes, as correctly pointed out by all. I have now corrected them. :)

@khoonun
Author

khoonun commented Sep 5, 2024

> Could you share some details on what exactly the Equiv parameters are supposed to do? Since there are 12 of those, the higher the resolution is, the number of values of those parameters quickly explodes. Also, can you elaborate on your constraints? I think I have an idea what you want to achieve with your individual constraints but in order to avoid biasing you with my interpretation, it would be better if you could first provide some information on what exactly you want to achieve with each individual constraint. Reason I am asking that depending on what you want to do with the constraints, you might be able to make Equiv parameters continuous, avoiding the combinatorial explosion.

Hi @AVHopp, enclosed is one example. The Equiv parameters represent the concentrations of the respective ingredients. Hope this explains it?

(image attached)

There are 3 levels of constraints above:

  1. To ensure each individual substance parameter is linked to its Equiv parameter.
  2. To ensure the total amount in each new recommendation does not exceed a certain limit.
  3. For cases like X21 and X22, which are based on the same X20 substance dictionary, to select two unique keys from the X20 dict, with no duplicates.

Hope this explains it as well.

I have been thinking of turning the Equiv parameters into continuous parameters, while still using SubstanceParameter so that we can leverage the molecular descriptors to select compatible chemicals. But I am not sure whether this can be implemented in BayBE yet.

@khoonun
Author

khoonun commented Sep 5, 2024

> @khoonun Independent of the issues already raised by the others: The space you seemingly want to build is unsurprisingly huge. There are a lot of parameters and choices - you already mentioned combinatorial explosion.
>
> We have a more memory-efficient implementation of the search space creation including the constraint application. It is automatically enabled if you install baybe with the polars dependency as in pip install baybe[chem,polars]
>
> But I doubt that this will work for the huge space you are trying to build.
>
> There exists no solution to this problem, if your space is too large, it is too large. So the only options you have imo are
>
>   • Reduce the number of parameters and their possible choices
>   • Move from a slot-based representation (the one you model here where we have slots corresponding to chemical substance choices and slot-amounts corresponding to the amounts of the slot) to a traditional representation. The traditional representation can be done via continuous parameters which will make the search space very small (it might still be a heavy computation during optimization + doesn't allow for chemical encodings as there are no labels)
>
> I should also mention that the memory number estimated by the utility is the upper limit of memory needed because it does not include the effect of constraints. So it might be that the actual memory you require is <1% of the number shown. I know this is unsatisfactory but it is not trivial to include the effect of constraints into the memory estimation. I make this point, because despite 338 TB shown as your estimate, the real requirement might be substantially lower and a compromise not as far out of reach as this number suggests.

@Scienfitz, all along I have already been using baybe[chem,polars], so it seems that does not help either. It is really helpful to know that the real memory requirement could be much less; in that case, I will need to find ways to investigate this further.

For sure, the chemical encoding is very useful and we wish to retain it. Given this, is there a non-traditional representation approach we can adopt here?

@AVHopp
Collaborator

AVHopp commented Sep 5, 2024

@khoonun in your example, you do not actually construct the search space object. The potential effect of the constraints can only materialize if you attempt to construct the search space (note that the estimation only takes the parameters into account!). Did you already attempt that?

@Scienfitz
Collaborator

Scienfitz commented Sep 5, 2024

@khoonun

A non-traditional approach will only work if you're willing to reduce the space.

Example for Traditional Mixture With Sub-Groups and Constraints

from baybe import Campaign
from baybe.constraints import (
    ContinuousLinearEqualityConstraint,
    ContinuousLinearInequalityConstraint,
)
from baybe.parameters import NumericalContinuousParameter
from baybe.searchspace import SearchSpace, SubspaceContinuous
from baybe.targets import NumericalTarget

# List of substance labels, divided into subgroups
g1 = ["A", "B"]
g2 = ["mol1", "mol2"]
g3 = ["substance1", "substance2"]

# Make continuous concentration parameters for each group
p_g1_concentrations = [
    NumericalContinuousParameter(name=name, bounds=(0, 20)) for name in g1
]
p_g2_concentrations = [
    NumericalContinuousParameter(name=name, bounds=(0, 40)) for name in g2
]
p_g3_concentrations = [
    NumericalContinuousParameter(name=name, bounds=(0, 60)) for name in g3
]

# Ensure total sum is 100
c_total_sum = ContinuousLinearEqualityConstraint(
    parameters=g1 + g2 + g3,
    coefficients=[1.0] * len(g1 + g2 + g3),
    rhs=100.0,
)

# Ensure sum of group 1 is smaller than 40
c_g1_max = ContinuousLinearInequalityConstraint(
    parameters=g1,
    coefficients=[-1.0] * len(g1),
    rhs=-40,
)

# Ensure sum of group 2 is larger than 60
c_g2_min = ContinuousLinearInequalityConstraint(
    parameters=g2,
    coefficients=[1.0] * len(g2),
    rhs=60,
)

searchspace = SearchSpace(
    continuous=SubspaceContinuous(
        parameters=p_g1_concentrations + p_g2_concentrations + p_g3_concentrations,
        constraints_lin_eq=[c_total_sum],
        constraints_lin_ineq=[c_g1_max, c_g2_min],
    ),
)
campaign = Campaign(
    searchspace=searchspace,
    objective=NumericalTarget(name="MyTarget", mode="MAX").to_objective(),
)

rec = campaign.recommend(10)
print(rec)

(image of the printed recommendations attached)

  • no molecular features are involved here
  • the reason you can't use continuous parameters with the slot-based approach is that the dependency is not implemented between discrete and continuous parameters - this is likely complicated
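To make the size difference between the two representations concrete, here is a quick count using the numbers from this thread (a sketch; the resolution-4 figure is the author's stated target, everything else comes from the posted parameter definitions):

```python
# Slot-based: every combination of substance choices and amount-grid values
# becomes one discrete row.
substance_choices = 3 * 3**2 * 6**3 * 2 * 5**5  # X11, X21-22, X31-33, X41, X51-55
resolution = 4                                   # the author's desired resolution
slot_rows = substance_choices * resolution**12   # 12 Equiv parameters
print(f"slot-based rows at resolution 4: {slot_rows:.2e}")

# Traditional: one continuous concentration parameter per distinct ingredient
# (3 + 3 + 6 + 2 + 5 labels), i.e. just a 19-dimensional box with linear
# constraints - no enumeration, essentially no memory cost.
continuous_dims = 3 + 3 + 6 + 2 + 5
print(f"traditional representation: {continuous_dims} continuous dimensions")
```

As noted in the bullets above, the price of the continuous route is losing the label-based chemical encodings, since no discrete substance slots remain to encode.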

@khoonun
Author

khoonun commented Sep 5, 2024

> @khoonun in your example, you do not actually construct the search space object. The potential effect of the constraints can only happen if you attempt to construct the search space (notice how the estimation does only take the parameters into account!), did you already attempt that?

Hi @AVHopp, I have constructed the search space object with the below:
searchspace = SearchSpace.from_product(parameters=parameters, constraints=constraints)

I was using the space size estimation in hindsight, to roughly understand the constraint I was facing.

@AdrianSosic
Collaborator

AdrianSosic commented Sep 5, 2024

@khoonun, without having followed the entire conversation so far, I'd like to make one important point that might help you:
Our search space constructors are just convenience methods! In most cases, they come in handy, but you don't have to use them!

In particular, the essence of the problem is that these constructors cannot easily account for the specific constraints of your use case in a computationally efficient way (that's basically also what @Scienfitz mentioned). For example, from_product will first create the full unconstrained space and then filter it down in a second phase. So even if the filtered result is tiny, you might never get there, because the first step kicks you out of memory.

So two things you can try:

  • Since you have a simplex constraint involved, consider our from_simplex constructor, which implements a very efficient routine at least for that particular type of constraint. If applicable, that will reduce memory consumption by A LOT!
  • What you can always do: create your search space manually, in a clever way. All you need to construct is the dataframe that lists the candidates you want to screen. You can then pass that directly to from_dataframe, which takes the table as-is, with no further risk of OOM involved. As long as you can construct the search space on your own, you are good to go!
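To illustrate the manual route in miniature (a sketch, not from the thread: the three-substance grid, the coarse amount grid, and the sum-to-40 filter are stand-ins for the real problem), the feasible candidates can be enumerated with plain itertools/pandas so that infeasible rows never exist in memory, and the result handed to SearchSpace.from_dataframe:

```python
from itertools import product

import pandas as pd

# Toy version with two substance slots and their amount parameters.
substances = ["D", "E", "F"]
amounts = [0.0, 20.0, 40.0]

rows = []
for s1, s2, a1, a2 in product(substances, substances, amounts, amounts):
    if s1 >= s2:           # enforce permutation invariance and no duplicate labels
        continue
    if a1 + a2 != 40.0:    # enforce the sum constraint *during* generation
        continue
    rows.append({"X21": s1, "X22": s2, "X2_1_Equiv": a1, "X2_2_Equiv": a2})

candidates = pd.DataFrame(rows)
print(len(candidates))  # 9 feasible rows, out of 81 unconstrained combinations
# candidates could now be passed to SearchSpace.from_dataframe together with
# the matching parameter objects.
```

Because the constraints are applied while generating rather than after a full product expansion, peak memory scales with the feasible set, not with the unconstrained one.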

@khoonun
Author

khoonun commented Sep 5, 2024

> @khoonun, without having followed the entire conversation so far, I'd like to make one important point that might help you: Our search space constructors are just convenience methods! In most cases, they come in handy, but you don't have to use them!
>
> In particular, the essence of the problem is that these constructors cannot easily account for specific constraints you might have in your use case in a computationally efficient way (that's basically also what @Scienfitz mentioned). For example, from_product will first create the full unconstrained space and then filter it down in a second phase. So even if the filtered result is tiny, you might not even get there if the first step kicks you out of memory.
>
> So two things you can try:
>
>   • Since you have a simplex constraint involved, consider our from_simplex constructor, which implements a very efficient routine at least for that particular type of constraint. If applicable, that will reduce memory consumption by A LOT!
>   • What you can always do: create your search space manually in a clever way. That is, all you need to construct is the dataframe that lists the candidates you want to screen. You can then directly pass that to from_dataframe and it will take that table as-is, with no further risk of OOM involved. That means, as long as you can construct the search space on your own, you are good to go!

This is very helpful, @AdrianSosic !
This makes a lot of sense, and I will investigate this!
Thank you so much to all!

@AVHopp
Collaborator

AVHopp commented Sep 16, 2024

Since there have been no updates here, I assume that the issue was solved. If not, feel free to reopen.

@AVHopp AVHopp closed this as completed Sep 16, 2024