Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEST behaves incorrectly or fails when connecting sliced layers while using MPI #3108

Open
heplesser opened this issue Feb 20, 2024 · 0 comments · May be fixed by #3132
Open

NEST behaves incorrectly or fails when connecting sliced layers while using MPI #3108

heplesser opened this issue Feb 20, 2024 · 0 comments · May be fixed by #3132
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next T: Bug Wrong statements in the code or documentation
Projects

Comments

@heplesser
Copy link
Contributor

This issue is based on a recent post on the NEST User Mailing List. Thanks to Miriam Kempter for reporting it!

A minimal reproducer with two different cases triggering errors in different locations is

import nest

# segfaults in neurons[pick].spatial / slice_positions_if_sliced_nc() with -np 3
# num_neurons = 7
# pick = 6

# with mask, vector access violation in Connect() / FreeLayer< D >::get_position() with -np 3 
num_neurons = 5
pick = 3

neurons = nest.Create(model='parrot_neuron', n=num_neurons,
                      positions=nest.spatial.free(
                                  pos=nest.random.uniform(min=-1, max=1),
                                  num_dimensions=2,
                                  edge_wrap=False))

print(neurons[pick].spatial)

nest.Connect(neurons[pick], neurons, {'rule': 'pairwise_bernoulli', 'p': 1.0,
                                      'mask': {'circular': {'radius': 2}}
                                          })

# expected output: source pick+1 connected once to each neuron
print(nest.GetConnections())

First issue

In the first case (with 7 neurons), the call to neurons[pick].spatial leads to use of an invalid access here:

sliced_points.push_back( positions[ index ] );

To protect against uncaught invalid access, one should add in

const Token&
operator[]( size_t i ) const
{
return ( *data )[ i ];
}
const Token&
get( long i ) const
{
return data->get( i );
}

an

 assert( index_is_valid( i ) );

Second issue

In the other case (with 5 neurons), the Connect() call in presence of a mask leads to and out-of-bounds exception here

const auto pos = get_position( ( *nc_it ).lid );

which calls

https://github.com/nest/nest-simulator/blob/8e85268ca512e0f509240fb5ef06f62a4e61a67e/nestkernel/free_layer.h#L270-275

Analysis

The most critical parts are probably

  • slice_positions_if_sliced_nc( DictionaryDatum& dict, const NodeCollectionDatum& nc )
  • The various ways of initializing NodeCollection::const_iterator, especially from NodeCollection::MPI_local_begin() and NodeCollection::local_begin() and the increment operators for the iterator.

The logic of these in relation to slicing and MPI needs review.

@heplesser heplesser added T: Bug Wrong statements in the code or documentation S: High Should be handled next I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) labels Feb 20, 2024
@heplesser heplesser added this to To do (open issues) in Kernel via automation Feb 20, 2024
@terhorstd terhorstd added this to the NEST 3.7 milestone Mar 4, 2024
@heplesser heplesser linked a pull request Mar 4, 2024 that will close this issue
3 tasks
heplesser added a commit to heplesser/nest-simulator that referenced this issue Mar 15, 2024
@terhorstd terhorstd removed this from the NEST 3.7 milestone Apr 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next T: Bug Wrong statements in the code or documentation
Projects
Kernel
  
To do (open issues)
Development

Successfully merging a pull request may close this issue.

2 participants