Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace BinaryValue with LogicArray #3244

Closed
wants to merge 42 commits into from

Conversation

aubindetrez
Copy link

@aubindetrez aubindetrez commented Feb 12, 2023

Replace BinaryValue with LogicArray like mentioned in issue #608 and PR #706.
All internal references to BinaryValue are removed.

Please this is a Breaking Changes - do not merge as-is, further discussions are required.

I think BinaryValue should be reintroduced for compatibility reason but marked deprecated. (And next major release it can be definitely removed). Let me know what you think.

Status / To dos:

  • Remove BinaryValue and internal references to it.
  • Pass all tests
    • See list of tests below
  • Implement all missing operators
    • See list below
  • Make type conversion more explicit LogicArray: Make type conversions more explicit #3418
  • Remove Bit
  • Construct LogicArray from String
  • Construct LogicArray from Iterable of Logic
  • Construct LogicArray from integer with explicit endianness
  • Convert LogicArray to String (binary representation)
  • Convert LogicArray to Integer and explicitly specify endianness and representation
  • get / set an element
  • get the length
  • bitwise logic
  • Explicit arithmetic shifts
  • Explicit logical shifts
  • No arithmetic operators with implicit conversion to int...
  • get 'slices' (value = signal[3:0])
  • set 'slices' (signal[3:0] = value)
  • Have comparable performance to BinaryValue
  • Proof of concept: Port one "big cocotb project" to use LogicArray instead of BinaryValue
  • Add meaningful error message to help with porting big code bases
  • cocotb/cocotb/handle.py uses downto by default, do we want to add an option for using to ?
  • Update the documentation (There are many references to BinaryValue)

LogicArray's operators compared to BinaryValue

The list of operators have been moved to: https://gist.github.com/aubindetrez/16ffe3b172ff0edd138001d61437d4ef (to improve clarity.

Failing tests

  • pytest cocotb/tests/pytests/test_logic_array.py depends on BinaryValue
  • Remove pytest cocotb/tests/pytest/test_binary_value.py
  • Test test_matrix_multiplier_icarus from cocotb/cocotb/examples/matrix_multiplier/tests/test_matrix_multiplier.py with error ModuleNotFoundError: No module named 'cocotb.binary'
  • Test test_matrix_multiplier_ghdl
  • Test issue_142_overflow_error from cocotb/tests/test_cases/issue_142/issue_142.py with error Test failed with RANDOM_SEED=1694419089
  • Test test_assigning_structure from cocotb/tests/test_cases/test_cocotb/test_deprecated.py with error Test failed with RANDOM_SEED=1694419168
  • Test test_force_release from cocotb/tests/test_cases/test_force_release/test_force_release.py with error Test failed with RANDOM_SEED=1694419237
  • Test test_my_design.my_first_test from cocotb/cocotb/examples/doc_examples/quickstart with error AssertionError: my_signal_2[0] is not 0!
  • Test test_my_design.my_second_test from cocotb/cocotb/examples/doc_examples/quickstart with error AssertionError: my_signal_2[0] is not 0!
  • issue_142_overflow_error
  • test_force_release

*If checked, the test passed at least locally (nox + pytest) but for the CI you have to scroll down this page.

Performance

@aubindetrez aubindetrez marked this pull request as draft February 12, 2023 15:08
@aubindetrez
Copy link
Author

@ktbarrett I would greatly appreciate your feedback on this.

Copy link
Member

@ktbarrett ktbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against this change. I appreciate the work! I think this PR should come in, but maybe we should fix a few critical bugs and do a 1.8 release before this comes in.

There is some additional development on these types that needs to occur, namely for performance (#2769). Also we need to consider removing Bit (#2666), as I don't think it's really useful.

cocotb/types/logic_array.py Outdated Show resolved Hide resolved
cocotb/types/logic_array.py Outdated Show resolved Hide resolved
cocotb/types/logic_array.py Outdated Show resolved Hide resolved
Comment on lines 169 to 171
elif isinstance(value, ctypes.Structure):
# ctypes.Structure is also typing.Iterable, but it is not supported
raise ValueError(f"{value} is an instance of ctypes.Structure which cannot be converted to LogicArray")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add this since it's been deprecated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to help user switch to LogicArray with a (hopefully) straightforward error.
While migrating big projects it could save even more time.

cocotb/types/logic_array.py Show resolved Hide resolved
cocotb/types/logic_array.py Outdated Show resolved Hide resolved
tests/pytest/test_logic_array.py Outdated Show resolved Hide resolved
cocotb/handle.py Outdated Show resolved Hide resolved
@ktbarrett
Copy link
Member

You should install pre-commit and run pre-commit on your commits. This should make sure that your commits will pass all the lint checks.

pip install pre-commit
pre-commit install

You can run the checks at any time using

pre-commit run -a

aubindetrez added a commit to aubindetrez/cocotb that referenced this pull request Feb 18, 2023
See comment from from @ktbarrett in cocotb#3244 for context:
> This isn't a good idea. It doesn't make it clear to the user which representation
> to interpret the bits as. Instead of making a choice for the user, users should
> make an explicit choice using the .integer and .signed_integer methods.
>
> From PEP 20:
> > Explicit is better than implicit.
> > In the face of ambiguity, refuse the temptation to guess.
aubindetrez and others added 18 commits September 7, 2023 13:11
Enables straighforward BigEndian / LittleEndian convertion.
(And one small description of exception raised by LogicArray.integer)
Co-authored-by: Kaleb Barrett <dev.ktbarrett@gmail.com>
Co-authored-by: Kaleb Barrett <dev.ktbarrett@gmail.com>
Co-authored-by: Kaleb Barrett <dev.ktbarrett@gmail.com>
See comment from from @ktbarrett in cocotb#3244 for context:
> This isn't a good idea. It doesn't make it clear to the user which representation
> to interpret the bits as. Instead of making a choice for the user, users should
> make an explicit choice using the .integer and .signed_integer methods.
>
> From PEP 20:
> > Explicit is better than implicit.
> > In the face of ambiguity, refuse the temptation to guess.
Failed example:
  File "logic_array.py" (doctest)
    la.reverse()
  File "logic_array.py", line 194, in reverse
    return type(self)(reversed(self), range=reversed(self.range))
  File "logic_array.py", line 184, in __init__
    if len(self._value) != len(self._range):
  Error:
    TypeError: object of type 'range_iterator' has no len()

Explanation:
Range.__reversed__() returns an range_iterator and LogicArray's constructor (LogicArray.__init__())
expects a Range.

Fix:
Implement a Range.reverse() function to return a reversed Range

Alternative:
Modify Range.__reversed__() to return a reversed Range instead of a
range_iterator
Fix format by running pre-commit as recommanded by ktbarrett
Example of fail:

Cause:
Previous implementation relies on the convertion to integer to compare two LogicArrays.
It fails when one of the logic array contains X/Z bits.

This commit also contains does not assume the internal binstr representation uses uppercase, as suggested by ktbarret.
Remove reference to binaryValue in test_logic_array.py
Have been suggested by ktbarret:
> Also add a check for the reversal of the range.
BinaryValue have been removed from this branch and replaced with
LogicArray. LogicArray has it's dedicated pytest tests.
Adding __mul__ __imul__ and __rmul__ operand to LogicArray to make it
behave similarly to BinaryValue (required by performance test
test_matrix_multiplier.multiply_test)
@teobiton
Copy link
Contributor

teobiton commented Sep 11, 2023

Hi,

Regarding the status, performance analysis is something to consider as well.
First observations by comparing profiling from the matrix multiplier with BinaryValue and with LogicArray, is that there is a huge drop in performance.

Based on current master and on the remove_binary_value branch:

Branch master remove_binary_value
Most time consuming function binary.py:249(_convert_from_unsigned) logic_array.py:202(integer)
Total time spent (s) 1.294 16.51
Cumulative time spent (s) 1.993 31.4

The source of the performance drop comes from the way the value is handled in LogicArray: each time the integer representation is needed, the array is looped over to build it.
As mentionned in #2769, it would be best to keep to python native types for better performance.

@aubindetrez
Copy link
Author

aubindetrez commented Sep 12, 2023

Yes performance is an issue too and it is in the to do list at the top of the issue, thanks for mentioning it.
Looks like #2769 is about refactoring LogicArray with performance in mind.

Optimizing performance of something not functional could be very error prone. Without the full picture we could be missing optimization opportunities or optimize out things we need. I suggest we get this branch to pass CI first, then refactor LogicArray to improve performance.
Unless we can do both in parallel ?!

EDIT: I assume we want to use LogicArray instead of BinaryValue anyway and that we don't want to introduce a new type.

Test `test_my_design.my_first_test` and `test_my_design.my_second_test` are
doing comparison with a bit and an integer.
In order to make LogicArray behave similarly to BinaryValue we also need to
implement comparison between Bit and int.

I also add a doctest in logic_array.py to make sure we don't break it when
refactoring LogicArray to improve performance. (It is not in logic.py because we
may remove it when refactoring)
aubindetrez added a commit to aubindetrez/cocotb that referenced this pull request Sep 13, 2023
See comment from from @ktbarrett in cocotb#3244 for context:
> This isn't a good idea. It doesn't make it clear to the user which representation
> to interpret the bits as. Instead of making a choice for the user, users should
> make an explicit choice using the .integer and .signed_integer methods.
>
> From PEP 20:
> > Explicit is better than implicit.
> > In the face of ambiguity, refuse the temptation to guess.
@aubindetrez
Copy link
Author

aubindetrez commented Sep 13, 2023

I tried to "cache LogicArray.integer" in a separate branch (look for "Experiment 1: 'Cache' self.integer" in first top of the PR). I get significant performance improvements with a simple change, but it's nowhere near BinaryValue.

I will keep experimenting with different implementations and share results and code in the first comment, at the top of the PR. If anyone else want to give it a try, you can comment below with your results/branch and I will add it to the list too.

Python 3.7 needs __int__ to perform integer conversion where Python 3.8+ uses index()
This is a fix for the CI / python3.7 tests
@aubindetrez
Copy link
Author

aubindetrez commented Sep 14, 2023

The latest commit passes CI.

FYI: To run python3.7 tests locally, I used nox -s dev_test-3.7 with:

--- a/noxfile.py
+++ b/noxfile.py
@@ -123,7 +123,7 @@ def dev_build(session: nox.Session) -> None:
     session.warn("No building is necessary for development sessions.")
 
 
-@nox.session
+@nox.session(python=["3.10", "3.7", "3.8"])
 def dev_test(session: nox.Session) -> None:
     """Run all development tests as configured through environment variables."""

Just in case someone else needs to debug python3.7 compatibility.

EDIT: I'm currently working on improving the performance and writing new performance benchmarks.

@imphil
Copy link
Member

imphil commented Sep 18, 2023

Thanks for your work, @aubindetrez! I just had a chat with @ktbarrett (and he'll probably jump in with more details), but just as a first notice:

LogicArray is (meant to be) not interpreting the data automatically, i.e. it only knows about bits (and Xes, etc.), but not about signed/unsigned, floats, byte ordering, etc. That means we cannot have mathematical operations and the like directly on LogicArray. Instead, users need to explicitly use LogicArray.signed_integer and friends and operate on them.

While looking at this issue I thought about ways to make the API more explicit, please have a look at #3418 and join the discussion.

@ktbarrett
Copy link
Member

I think we can also remove Bit as it currently serves no purpose. There was interest by me originally to add more types, but I think we aren't doing that anymore.

@ktbarrett
Copy link
Member

ktbarrett commented Sep 18, 2023

Generally what we need is:

  • construct from string
  • construct from iterable of logic
  • construct from integer specifying the endianness (representation can be deduced since we only support two's complement and unsigned, which is a subset)
  • conversion to binary string
  • conversion to integer specifying the endianness and representation
  • sequence-like stuff: get and set elements, iteration, length, blah blah blah
  • bitwise logic

What we cannot assume:

  • shifts (are they arithmetic or logical)
  • any arithmetic operators
  • how users want hex and octal representations formatted

@teobiton
Copy link
Contributor

Hi all,

To expand on the performance trials, I implemented an int and string representation for LogicArray that is used as much as possible when valid, i.e not modified by set_item and other arithmetic operations that are implemented for now (see logicarray_perf_repr).

The results are far better than the original implementation but we are nowhere near the results with BinaryValue per the construction of LogicArray inheriting from Array[Logic]. A lot of time is spent rebuilding the array even with the string and int representations (total time 7 seconds for binstr, integer has better results around 4 seconds total time).

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Patch coverage: 47.95% and project coverage change: +0.12% 🎉

Comparison is base (d56692a) 43.78% compared to head (eec5994) 43.91%.
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3244      +/-   ##
==========================================
+ Coverage   43.78%   43.91%   +0.12%     
==========================================
  Files          49       48       -1     
  Lines        8863     8433     -430     
  Branches     2458     2356     -102     
==========================================
- Hits         3881     3703     -178     
+ Misses       4401     4190     -211     
+ Partials      581      540      -41     
Files Changed Coverage Δ
cocotb/types/logic.py 66.21% <33.33%> (+3.05%) ⬆️
cocotb/types/logic_array.py 53.14% <43.75%> (+8.14%) ⬆️
cocotb/types/range.py 59.30% <50.00%> (+6.86%) ⬆️
cocotb/handle.py 78.70% <80.00%> (-0.17%) ⬇️
cocotb/types/__init__.py 25.00% <100.00%> (ø)

... and 13 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

References to the Bit type are removed from documentation and
tests.
@ktbarrett
Copy link
Member

ktbarrett commented Sep 21, 2023

@teobiton The next stumbling block is probably the construction of the Range object, which occurs if no range is given. You can test that by creating a Range object once and passing it in the range argument in your benchmark and see if performance improves.

We can then investigate maintaining multiple Range objects as well.

I wonder if we should start from the ground up with performance in mind (at least for the LogicArray class).

@ktbarrett
Copy link
Member

ktbarrett commented Sep 21, 2023

@teobiton Actually, I see you are still creating the List[Logic] repr even when being constructing from a string. You should populate the str represent instead of the List[Logic] repr when given a string and create the List[Logic] impl on demand. Most of the performance issues come from constructing this list. It requires the creation of N Logic values and a list to hold them.

An easy way to do this is with getter which encapsulate the creation. The only time you need to set the string and int representations is on construction and I don't know if we need to cache them after that fact, so once they are invalidated on a __setitem__, they are permanently invalidated. Although, you might go super extra and do general-purpose caching on the string and/or int representations like you are doing rn.

A small mock up
class LogicArray:
    @overload
    def __init__(self, value: str):
        ...

    @overload
    def __init__(self, value: int, *, byteorder: str) -> None:
        ...

    @overload
    def __init__(self, value: SizedIterable[LogicConstructible]) -> None:
        ...

    def __init__(self, value, *, byteorder: str):
        self._str_impl_valid = False
        self._list_impl_valid = False
        self._int_impl_valid = False
        if isinstance(value, str):
            self._str_impl = value
            # validate
            self._str_impl_valid = True
        elif isinstance(value, int):
            self._int_impl = value
            self._int_impl_byteorder = byteorder
            # validate
            self._int_impl_valid = True
        else:
            self._list_impl = list(value)
            # validate
            self._list_impl_valid = True

    def __str__(self) -> str:
        if self._str_impl_valid:
            return self._str_impl
        # otherwise construct from int or logic representation

    def as_int(self, *, byteorder: str, signed: bool) -> int:
        if self._int_impl_valid and self._int_impl_byteorder == byteorder:
            # might need to work out if two's complement is needed if 'signed' is False and '_int_impl' is negative
            return self._int_impl
        # otherwise construct from str, int, or logic representation

    @property
    def _list_impl(self) -> List[Logic]:
        if self._list_impl_valid:
            return self._list_impl_
        # construct from str or int representation

    def __iter__(self) -> Iterable[Logic]:
        return iter(self._list_impl)

    def __reversed__(self) -> Iterable[Logic]:
        return reversed(self._list_impl)

    def __setitem__(self, key: int, value: LogicConstructable) -> None:
        self._int_impl_valid = False
        self._str_impl_valid = False
        # validate
        return self._list_impl[index]

@marlonjames marlonjames added the category:codebase:handle relating to handles or handle types (BinaryValue) label Oct 12, 2023
@ktbarrett
Copy link
Member

Where are we with this?

@aubindetrez
Copy link
Author

We (I) still need to change the API as discussed by Philipp and you. This will break many tests and examples.

We also need to improve performance further.

I personally don't have a lot of free time right now but I hope it should get better in a few weeks. If anyone else wants to work on it he/she is more than welcome.

@teobiton
Copy link
Contributor

teobiton commented Nov 3, 2023

So I picked up things where I left them and tried a new implementation based on your comments @ktbarrett, still on the same branch logicarray_perf_repr). It's not quite done yet.

Do we still want a Range being constructed and inheriting from Array[Logic] ? In my current trial I totally removed them from the class (breaking everything obviously), only building the Logic(_) array when required.

@marlonjames
Copy link
Contributor

@ktbarrett Is this PR superseded by your other handle PRs?

@ktbarrett
Copy link
Member

Superceded by #3634.

@ktbarrett ktbarrett closed this Mar 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:codebase:handle relating to handles or handle types (BinaryValue)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants