Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RFC for a GPIO peripheral. #49

Merged
merged 7 commits into from
Mar 8, 2024

Conversation

jfng
Copy link
Member

@jfng jfng commented Feb 27, 2024

@jfng jfng marked this pull request as draft February 27, 2024 12:12
* Fix SVG diagrams.
* Add open-drain mode.
* Remove ISync register.
* Remove pin count limit.
* Rename module to `amaranth_soc.gpio`.
@whitequark whitequark added meta:nominated Nominated for discussion on the next relevant meeting area:soc RFC affecting APIs in amaranth-lang/amaranth-soc labels Mar 1, 2024
Also, add white backgrounds to SVG diagrams.
@jfng jfng marked this pull request as ready for review March 1, 2024 15:45
@whitequark
Copy link
Member

whitequark commented Mar 1, 2024

I have a few questions and comments:

  • What is the behavior if a single write has both set_N and clr_N bits set (for the same N)?
    • I think the most reasonable option is "nothing happens". This doesn't really increase gate count and it's better than arbitrarily picking set-over-clear or clear-over-set. It would be a per-bit rule, i.e. bit N can change state iff set_N ^ clr_N.
      • Also, if for some reason an all-ones pattern is written into the register (e.g.: the DMA peripheral encountered a bus error traversing a bitbang pattern) the output won't get corrupted.
  • What are the exact rules for the layout of SetClr register?
  • What happens if the unallocated 0b11 bit pattern is written into the mode field for a pin?
    • If we choose to leave it unallocated, I think it should do the same thing as INPUT_ONLY and that we should document it as "currently it aliases INPUT_ONLY but this should not be used as in a later version we reserve the right to change the meaning".
    • Given that it is the last unallocated bit pattern, we could allocate it with the following semantics: if pin mode is set to ALTERNATE = 0b11, then the GPIO peripheral is considered to no longer control that pin and something else in the SoC takes precedence. In this case, the pins would come up as GPIO inputs after reset (the safest possible mode) and then could be switched to output or alternate mode afterwards.
      • An output port alt_mode that is pin_count wide could be added in this case that indicates, for every pin, whether it's controlled by the GPIO peripheral or by some other one. An external to the GPIO peripheral pin multiplexer could use this port to determine whether to route o/oe signlas from the GPIO peripheral or some other peripheral.
      • Because every SoC will have GPIOs, making them the foundation of I/O, it seems reasonable to have the GPIO mode register override everything else in the SoC; then, drivers that wish to bitbang a pin (which can be necessary, for example, to work around a buggy interface peripheral) can safely do this at any point by switching the mode to INPUT or PUSH_PULL without needing to know any more details about more complex peripherals.
      • When pin mode is set to ALTERNATE, the meaning of i, o, and oe fields for a pin would be implementation-defined rather than defined by the GPIO peripheral itself.
      • This is basically the escape hatch for implementation-defined behavior that is less invasive than reimplementing the whole peripheral, so allocating the last bit pattern to it seems like a good fit.
      • When the ALTERNATE mode is not implemented it should be equivalent to INPUT_ONLY.
  • What are the reset values of all the RW registers?
  • Are the synchronizer flops reset by the peripheral sync domain reset? (Depending on the answer here, the inputs could potentially glitch in the first few cycles after reset. Not sure if it's important.)
  • It doesn't seem necessary to limit input_stages to at most 2; I think accepting any non-negative number is fine.

@jfng
Copy link
Member Author

jfng commented Mar 1, 2024

I have a few questions:

* What is the behavior if a single write has both `set_N` and `clr_N` bits set (for the same N)?
  
  * I think the most reasonable option is "nothing happens". This doesn't really increase gate count and it's better than arbitrarily picking set-over-clear or clear-over-set. It would be a per-bit rule, i.e. bit N can change state iff `set_N ^ clr_N`.
    
    * Also, if for some reason an all-ones pattern is written into the register (e.g.: the DMA peripheral encountered a bus error) the output won't get corrupted.

It would have been set-over-clear, for consistency with the CSR API. Though in the case of RW1S and RW1C, the set/clear operations originate from different sides and both are expected to happen in normal conditions. This is not the case here, and I agree that set^clear would make more sense.

* What are the exact rules for the layout of `SetClr` register?

2 ** ceil_log2(pin_count) bits for the set_x array, with the lower pin_count bits occupied by 1-bit W fields. Same for the clr_x array.

* What happens if an undocumented bit pattern is written into the mode register?
  
  * I think it should do the same thing as `INPUT_ONLY` and that we should document it as "currently it aliases `INPUT_ONLY` but this should not be used as in a later version we reserve the right to change the meaning".
  * Given that it is the last unallocated bit pattern, one possible semantics for a future allocation of it would be: if pin mode is set to 0b11, the GPIO peripheral no longer controls that pin and something else in the SoC takes precedence. So the pins would come up in GPIO input mode by default but then could be released to be used by other modules.
    
    * This is (largely; assuming all other peripherals are disabled by default and do not take control of the pins before they're explicitly initialized) backward compatible with `INPUT_ONLY`.
    * An output port could be added in this case that indicates, for every pin, whether it's controlled by the GPIO peripheral or by some other one.

Given that the Input register is currently always sampling pin.i regardless of the mode, defaulting to INPUT_ONLY for 0b11 seems reasonable. We could also document that the value of Input is undefined for Mode = 0b11 for now.

* What are the reset values of all the RW registers?

Zero.

* Are the synchronizer flops reset by the peripheral `sync` domain reset? (Depending on the answer here, the inputs could potentially glitch in the first few cycles after reset. Not sure if it's important.)

In my prototype, they use reset_less=True. I don't know whether it matters in the context of this peripheral.

* It doesn't seem necessary to limit input_stages to at most 2; I think accepting any non-negative number is fine.

Yes, this would be consistent with FFSynchronizer (except it also does not allow less than 2 stages).

@whitequark
Copy link
Member

I've edited my comment to add an idea for allocating the 0b11 bit pattern for the mode field.

@jfng
Copy link
Member Author

jfng commented Mar 1, 2024

Having an ALTERNATE mode as a fourth, implementation-dependent, mode seems reasonable. It would allow downstream implementations to provide an arbitrary number of additional modes controlled by other registers.

@whitequark
Copy link
Member

whitequark commented Mar 1, 2024

One more comment:

  • We could document our policy for extending the GPIO peripheral when necessary. E.g.: "If or when we add registers for new optional features, such as pull-ups, switchable schmitt triggers, switchable output driver strengths, etc, each register will always reside at the same fixed (for a given pin count) address regardless of which features are enabled, and each of these registers will be all-0s after reset, where such all-0s value will provide behavior identical to the behavior of the peripheral without the optional feature. Slots in the address space will never be reallocated with a different meaning once allocated upstream in Amaranth SoC."
    • This will be important to industry users caring about forward and cross-family/cross-configuraiton compatibility.
    • In a perfect world this would be our policy for every peripheral. Realistically, we'll only be able to provide this strongest guarantee for only a subset of peripherals.

@whitequark
Copy link
Member

Comment:

  • If we guarantee that the SetClr register is power-of-2 sized, then all the rest should also be padded to be power-of-2 sized. (This can come up with e.g. 24 pins in a 8-bit data width bus.)

@tpwrules
Copy link
Contributor

tpwrules commented Mar 4, 2024

Some minor questions/comments:

  • For the ALTERNATE mode, there would have to be some output to an external entity that a particular pin is in that mode. Would the GPIO peripheral still be directly connected to the pin (and have additional connections to the external entity to pass these signals on), or rely on the external entity to mux the pin between itself and the peripheral? I vote the latter for implementation simplicity and later flexibility.
    • I would say to leave the input behavior as undefined, some chips have flags to disable input buffers in e.g. analog alternate modes so defining it could be overly constraining. If defined this would also mean the external entity has to still pass the input through (maybe it could be "mode-defined" instead).
  • The example SetClr array seems to be twice as big as necessary, for four pins there are only 8 bits necessary in it, but 16 are illustrated. The earlier response suggests that 8 are intended, and this is what I'll assume for future discussion:
    • 2 ** ceil_log2(pin_count) bits for the set_x array, with the lower pin_count bits occupied by 1-bit W fields. Same for the clr_x array.

  • I don't think it's necessary or desirable that addr_width be mandatory. Computing an exact value is difficult as it relies on the exact details of the peripheral and the CSR machinery. Most users wouldn't care; any overlaps or insufficient width should be caught by other parts of the machinery. Removing it entirely should be feasible but making it optional would probably be better.

The major concern that this raises to me is a lot of questions surrounding the balance of flexibility vs documentability. I think the CSR machinery provides excellent flexibility that this peripheral could take advantage of, but actually using that may create a peripheral that's too difficult to explain, and so the peripheral should be designed in this light and excessively complex configurations be prohibited. For context, I am familiar with microcontroller peripherals, where the system has a specific bus width (8, 16, or 32) and peripheral registers are always exactly this width.

Some specific questions related to the above:

  • For the purpose of these hypotheticals, I propose we interleave the set and clear bits in SetClr to avoid its own questions about rounding. It should always be 2*pin_count, bit 0 sets pin 0, bit 1 clears pin 0, bit 2 sets pin 1, etc.
  • Do we round up registers to power-of-2 (multiple of data_width, which need not be a power of 2) sizes? If yes, then there's excessive latency from narrow busses.
    • It appears this is the behavior since csr.RegisterMap became csr.Builder, however, this aspect was not a part of the RFC since this happened after that was decided to be merged.
    • Are non-power-of-two data_width sizes intended to be supported in general? A quick glance shows this might break some stuff.
  • If we don't round sizes, do we at least align addresses to a power of two? If yes, this simplifies decoding logic.
  • Do we have some registers which are pin_width and others which are 2*pin_width or are all the same size? If all the same size, I assume they would be padded with ResR0W0. Padding would introduce additional latency. If we don't pad, then do we align here? Does this mean that peripherals are expected to have all their registers be the same size?
  • What about bits which are added to pad to a multiple of data_width? Are they implicitly or explicitly ResR0W0? The CSR machinery can pad here, but this could create two types of "don't access" bit in a particular word: those that are a part of the register and those implied by the CSR machinery, which could confuse a documentation generator.
  • How does granularity fit into all this? Usually even 32 bit microcontrollers are byte-addressed, but the CSR machinery generally works in word addresses. Would the granularity influence the documentation? Should peripherals always use a granularity of 8?

@whitequark
Copy link
Member

whitequark commented Mar 4, 2024

  • For the ALTERNATE mode, there would have to be some output to an external entity that a particular pin is in that mode. Would the GPIO peripheral still be directly connected to the pin (and have additional connections to the external entity to pass these signals on), or rely on the external entity to mux the pin between itself and the peripheral? I vote the latter for implementation simplicity and later flexibility.

    • I would say to leave the input behavior as undefined, some chips have flags to disable input buffers in e.g. analog alternate modes so defining it could be overly constraining. If defined this would also mean the external entity has to still pass the input through (maybe it could be "mode-defined" instead).

In my proposal, selecting ALTERNATE mode delegates the meaning of bits in Input and Output registers to the wider SoC implementation, or perhaps a custom derived implementation/wrapper of this GPIO register. Nothing is guaranteed about them at all. The pin multiplexer could choose to wire Input to the input buffers, or disable the buffers entirely, or do something else completely.

  • How does granularity fit into all this? Usually even 32 bit microcontrollers are byte-addressed, but the CSR machinery generally works in word addresses. Would the granularity influence the documentation? Should peripherals always use a granularity of 8?

The entire register is always accessed simultaneously. In general, on a 32-bit MCU, I will recommend using a 32-bit CSR bus unless otherwise needed. In this case every access is guaranteed to be 1-cycle, 0-latency. Otherwise the bridge will take 4 cycles to do a register write, which is rarely worth it if you can spend the area.

It is common for peripherals to support e.g. only word accesses, but return undefined data on half or byte accesses. (STM32 does this for example.)

…nth meeting.

- Add support for an ALTERNATE mode.
- Use interleaved set/clear bits in the SetClr register.
- Remove upper bound on the input_stage parameter.
- Add hints towards a policy for backward-compatible peripheral extensions.
@whitequark
Copy link
Member

Three of the register diagrams are unreadable on black background again.

@whitequark
Copy link
Member

@jfng You have not clarified the behavior on writing to both set_x and clr_x.

Also, I'm not sure if we can use that naming--won't it be pin_x.set and pin_x.clr if we use a field array?

@jfng
Copy link
Member Author

jfng commented Mar 5, 2024

@jfng You have not clarified the behavior on writing to both set_x and clr_x.

My bad.

Also, I'm not sure if we can use that naming--won't it be pin_x.set and pin_x.clr if we use a field array?

Correct. Each register should only need a field array, where each item is associated to a pin.

@whitequark
Copy link
Member

Correct. Each register should only need a field array, where each item is associated to a pin.

Right, so I guess the diagram should be amended for that?

@whitequark
Copy link
Member

We have discussed this RFC on the 2024-03-08 SoC meeting. The disposition was to merge.

@whitequark whitequark merged commit c1573d9 into amaranth-lang:main Mar 8, 2024
@whitequark whitequark removed the meta:nominated Nominated for discussion on the next relevant meeting label Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:soc RFC affecting APIs in amaranth-lang/amaranth-soc
3 participants