swarm/network/stream: added pure retrieval test (syncing disabled) #1355
Conversation
holisticode added the test label on Apr 26, 2019
holisticode requested review from nonsense, janos and zelig on Apr 26, 2019
holisticode self-assigned this on Apr 26, 2019
holisticode referenced this pull request on Apr 26, 2019: swarm/network/stream: added pure retrieval test (no syncing) #19502 (closed)
zelig added this to Backlog in Swarm via automation on Apr 29, 2019
zelig moved this from Backlog to In review in Swarm on Apr 29, 2019
zelig requested changes on May 1, 2019
log.Info("Starting simulation")

result := sim.Run(ctx, func(ctx context.Context, sim *simulation.Simulation) error {
	nodeIDs := sim.UpNodeIDs()
holisticode (Author) commented May 3, 2019
Strictly speaking, if we want UpNodeIDs() (the nodes which are up), then we should query this information once the simulation has started, i.e. inside sim.Run().
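For illustration, a minimal sketch of that point, querying the up nodes only inside the run callback. This is a fragment only: the simulation setup, imports and the nodeCount variable are assumed, while sim.Run, sim.UpNodeIDs and the Result.Error check follow the pattern visible in the diff above and in the existing simulation tests.

result := sim.Run(ctx, func(ctx context.Context, sim *simulation.Simulation) error {
	// query the up nodes only once the simulation is running, so the
	// list reflects the state seen inside the run callback
	nodeIDs := sim.UpNodeIDs()
	if len(nodeIDs) != nodeCount {
		return fmt.Errorf("expected %d up nodes, got %d", nodeCount, len(nodeIDs))
	}
	// ... retrieval checks per node go here ...
	return nil
})
if result.Error != nil {
	t.Fatal(result.Error)
}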
cnt := 0

REPEAT:
	for {
zelig commented May 1, 2019
That would also make it concurrent, which is better for testing realistic scenarios.
I would just allocate chunks to nodes so that each node retrieves its own chunks independently; otherwise caching may distort the pure nature of this test.
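One possible reading of this suggestion, as a hedged sketch: partition the chunks across the up nodes so that each chunk is retrieved by exactly one node, and run the per-node retrievals concurrently. The fileStoreOf helper is hypothetical, and errgroup (golang.org/x/sync/errgroup) is only used here to keep the concurrent version short; neither is part of the PR.

// partition chunks across nodes so that no chunk is retrieved (and
// therefore cached) by more than one node, and retrieve concurrently
g, ctx := errgroup.WithContext(ctx)
for i, nodeID := range nodeIDs {
	i, nodeID := i, nodeID // capture loop variables for the goroutine
	g.Go(func() error {
		// hypothetical helper returning this node's *storage.FileStore
		fileStore := fileStoreOf(sim, nodeID)
		// every len(nodeIDs)-th chunk is allocated to this node
		for j := i; j < len(chunks); j += len(nodeIDs) {
			reader, _ := fileStore.Retrieve(ctx, chunks[j].Address())
			if s, err := reader.Size(ctx, nil); err != nil || s != int64(chunkSize) {
				return fmt.Errorf("node %v failed to retrieve chunk %v: %v", nodeID, chunks[j].Address(), err)
			}
		}
		return nil
	})
}
return g.Wait()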
holisticode (Author) commented May 3, 2019
I am not sure I understand what you mean here, nor what action I should take on your comments, so for now I have not changed anything.
	return fmt.Errorf("No filestore")
}
fileStore := item.(*storage.FileStore)
for _, chunk := range chunks {
zelig commented May 1, 2019
If we retrieve each chunk from each node, there is a lot of caching happening, so it may mask retrieval issues.
holisticode (Author) commented May 3, 2019
I partly agree, but ultimately disagree.
Retrieval in Swarm is intrinsically linked to caching; you yourself have pointed out repeatedly that we cannot actually switch caching off.
My conclusion is that caching is therefore part of retrieval itself, so we should leave this as-is. I would furthermore not know how to circumvent it, and you would need to specify what you mean by "retrieval issues" before anything can be done about them.
for _, chunk := range chunks {
	reader, _ := fileStore.Retrieve(context.TODO(), chunk.Address())
	// check that we can read the file size and that it corresponds to the generated file size
	if s, err := reader.Size(ctx, nil); err != nil || s != int64(chunkSize) {

// second iteration: store chunks at the nodes they would be
// expected to be
log.Debug("storing every chunk at correspondent node store")
holisticode commented Apr 26, 2019
This PR adds a pure retrieval test to the snapshot retrieval tests.
Until now, the retrieval tests uploaded chunks to randomly selected nodes and then tried to download those chunks from every node. Uploading used the FileStore.Store() function, essentially writing chunks to individual nodes directly. For retrieval to work in such a scenario, syncing must be enabled so that chunks end up at the nodes where they are actually expected to be.
This new test bypasses syncing by evaluating which chunk would be expected at which node and storing the chunks directly in those nodes' LocalStore, essentially simulating syncing by accessing the stores directly (the simulation framework allows that). This way, the test simulation run can switch syncing off and start retrieving chunks immediately, as if syncing had already completed.
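As an illustration of the placement step described above, here is a hedged sketch, not the PR's actual code: for every generated chunk, pick the up node whose overlay address shares the longest prefix with the chunk address and write the chunk straight into that node's store. The overlayAddrOf and putChunk helpers are hypothetical stand-ins for the simulation accessors that expose a node's overlay address and LocalStore.

// proximity returns the number of leading bits two addresses share;
// a higher value means the addresses are closer in Swarm's XOR metric
func proximity(a, b []byte) int {
	for i := 0; i < len(a) && i < len(b); i++ {
		x := a[i] ^ b[i]
		if x == 0 {
			continue
		}
		p := i * 8
		for x&0x80 == 0 { // count leading zero bits of the differing byte
			x <<= 1
			p++
		}
		return p
	}
	return len(a) * 8
}

// inside the simulation run callback: place every chunk at the node
// closest to its address, writing directly into that node's store so
// that no syncing is needed before retrieval starts
for _, ch := range chunks {
	bestIdx, bestPO := -1, -1
	for i, id := range nodeIDs {
		if po := proximity(ch.Address(), overlayAddrOf(id)); po > bestPO {
			bestIdx, bestPO = i, po
		}
	}
	// putChunk is a hypothetical helper that stores the chunk in the
	// chosen node's LocalStore via the simulation's node buckets
	if err := putChunk(sim, nodeIDs[bestIdx], ch); err != nil {
		return err
	}
}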