Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix two cycle read/write bug #50

Merged
merged 2 commits into from
May 3, 2022
Merged

Fix two cycle read/write bug #50

merged 2 commits into from
May 3, 2022

Conversation

tdb-alcorn
Copy link
Contributor

This change adds transparent queues to all the control queues in the accumulator modules to unblock control flow. When a control queue needs to address more than one subordinate before proceeding, we use a multi enqueue to handle this transaction. This works well when all the subordinates are engaged at the same stage in the data pipeline, but when one subordinate operates at a later stage, it can cause earlier subordinates to have to wait for it unnecessarily. By adding tiny transparent queues to all the inputs to a multi enqueue, we cut this dependency because the later subordinates control flow can simply buffer up. In the future, when we add a multi enqueue that addresses subordinates that might be separated by up to N cycles in the pipeline, we should add a transparent queue of N elements to each control input.

This change also adds a helpful command to the readme and tidies up a few comment messes.

This makes it so that multienqueues never block control flow. Previously
blocked control flow was leading to hiccups in data flow, which in turn
caused more control congestion. The result was generally a doubling in
the number of cycles needed to perform basic function in the
accumulator.
@tdb-alcorn tdb-alcorn requested a review from petrohi May 3, 2022 00:14
@petrohi
Copy link
Contributor

petrohi commented May 3, 2022

On FPGA:

  ResNet20 CIFAR ONNX YOLOv4 Tiny @416 ONNX ResNet50 ImageNet ONNX
A. ZCU104 baseline (ms) 13.099 294.071 514.279
B. A and external AXI4S Width Converter instead of Transmisson 11.168 262.804 473.956
C. B and two-cycle read/write fix 9.661 214.892 415.953

Copy link
Contributor

@petrohi petrohi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResNet20V2 on CIFAR: 10000 images 0.90 accuracy at 103.82 fps

LGTM!!!

@tdb-alcorn tdb-alcorn merged commit 8f17bf2 into main May 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants