
Parallel derivative coloring with vector design variable can yield zero derivatives #2919

Closed · Asthelen opened this issue May 17, 2023 · 4 comments · Fixed by #3069

@Asthelen

Description

When testing parallel_deriv_color with a new model that has two parallel groups, the derivatives for one of the design variables (a vector whose indices are connected to each parallel group) were observed to be zero. Parallel derivative coloring has worked for other models, so it is unclear what the key difference is.

Nonetheless, the example below seems to show this same behavior. When running compute_totals with 2 processors (1 per group), coloring yields total derivatives of

{('parallel_group.dummy_comp1.y', 'dvs.x'): array([[1., 3.]]), ('parallel_group.dummy_comp2.y', 'dvs.x'): array([[-0., -0.]])}

Without coloring, the derivatives are correct:

{('parallel_group.dummy_comp1.y', 'dvs.x'): array([[ 1., -0.]]), ('parallel_group.dummy_comp2.y', 'dvs.x'): array([[-0., 3.]])}

When using more than 2 processors, coloring yields all-zero derivatives with respect to x.

Example

import openmdao.api as om

class DummyComp(om.ExplicitComponent):
    def initialize(self):
        self.options.declare('a', default=0.)
        self.options.declare('b', default=0.)

    def setup(self):
        self.add_input('x')
        self.add_output('y', 0.)

    def compute(self, inputs, outputs):
        outputs['y'] = self.options['a'] * inputs['x'] + self.options['b']

    def compute_jacvec_product(self, inputs, d_inputs, d_outputs, mode):
        # matrix-free reverse-mode derivative: dy/dx = a
        if mode == 'rev':
            if 'y' in d_outputs:
                if 'x' in d_inputs:
                    d_inputs['x'] += self.options['a'] * d_outputs['y']

class DummyGroup(om.ParallelGroup):
    def setup(self):
        self.add_subsystem('dummy_comp1', DummyComp(a=1., b=2.))
        self.add_subsystem('dummy_comp2', DummyComp(a=3., b=4.))

class Top(om.Group):
    def setup(self):
        self.add_subsystem('dvs', om.IndepVarComp(), promotes=['*'])
        self.dvs.add_output('x', [1., 2.])
        self.add_subsystem('parallel_group', DummyGroup())
        # connect one entry of the vector design variable to each parallel component
        self.connect('x', 'parallel_group.dummy_comp1.x', src_indices=[0])
        self.connect('x', 'parallel_group.dummy_comp2.x', src_indices=[1])

prob = om.Problem()
prob.model = Top()
prob.model.add_design_var('x', lower=0., upper=1.)

# parallel_deriv_color: use None to disable coloring, or a string name to enable it
deriv_color = 'deriv_color'

# compute derivatives for made-up y constraints in parallel
prob.model.add_constraint('parallel_group.dummy_comp1.y',
                          lower=1.0,
                          parallel_deriv_color=deriv_color)
prob.model.add_constraint('parallel_group.dummy_comp2.y',
                          lower=1.0,
                          parallel_deriv_color=deriv_color)

prob.setup(mode='rev')
prob.run_model()
prob.check_totals(compact_print=True,
                  show_progress=True,
                  directional=True)
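
For reference, this reproduction needs to run under MPI with at least 2 ranks so that the two components land on different processors; a typical invocation (assuming the script is saved as parallel_color_example.py, a placeholder filename) would be:

mpirun -n 2 python parallel_color_example.py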

OpenMDAO Version

3.26.1-dev

Relevant environment information

I am using Python 3.9.5 with the following package versions (among other packages):

numpy==1.24.2
scipy==1.10.1
mpi4py==3.1.4
petsc4py==3.15.0

@naylor-b (Member)

I was able to make this work by converting the indep variable 'x' to a distributed variable, changing

self.dvs.add_output('x', [1., 2.])

to this

if self.comm.rank == 0:
    self.dvs.add_output('x', [1.], distributed=True)
else:
    self.dvs.add_output('x', [2.], distributed=True)

This causes the src_indices of [0] and [1] to reference values on different procs, so the parallel derivatives work correctly. Your example uses src_indices of [0] and [1] to reference into a non-distributed variable, which, because of how our transfer scheme currently works, causes both indices to refer to the 0 and 1 entries of the variable on rank 0, so we lose the separation between procs that makes parallel deriv coloring work.
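
To make the workaround concrete, here is a minimal sketch (an editorial illustration, not part of the original comment) of the example's Top class with the distributed indep var substituted in; the connections and their src_indices are unchanged:

class Top(om.Group):
    def setup(self):
        self.add_subsystem('dvs', om.IndepVarComp(), promotes=['*'])
        # each rank owns one entry of the now-distributed design variable
        if self.comm.rank == 0:
            self.dvs.add_output('x', [1.], distributed=True)
        else:
            self.dvs.add_output('x', [2.], distributed=True)
        self.add_subsystem('parallel_group', DummyGroup())
        # src_indices now point at entries that live on different procs
        self.connect('x', 'parallel_group.dummy_comp1.x', src_indices=[0])
        self.connect('x', 'parallel_group.dummy_comp2.x', src_indices=[1])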

Is converting your indep var to a distributed variable an acceptable workaround?

@Asthelen (Author)

Thanks for looking into this.

Unfortunately, I don't think this fixes the MPhys code in which I first noticed this issue.

That code runs two aerostructural scenarios in parallel, with each scenario potentially using more than 1 processor. I think that's a common situation where MPhys folks may want to use derivative coloring: enforcing lift constraints at multiple flight conditions by trimming angle of attack.

Anyway, if this workaround can be made to work with >2 processors, then maybe that would suffice.

@naylor-b (Member)

I think that this one has been fixed now as part of my current refactor of our reverse transfers (on my transfers5 branch). If you have any time to try it out, let me know if it fixes this issue for you.
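
One way to try that branch (assuming transfers5 lives on the naylor-b fork of OpenMDAO, which is not stated explicitly in this thread) would be a pip install directly from git:

python -m pip install git+https://github.com/naylor-b/OpenMDAO.git@transfers5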

@Asthelen (Author)

That's great! The aerostructural MPhys example works now when I change aoa1 and aoa2 back to a vector of aoa. The above example works as well, of course. I'll check the other two issues now.
