Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial FASM generator #57

Merged
merged 10 commits into from Apr 14, 2021
Merged

Conversation

acomodi
Copy link
Contributor

@acomodi acomodi commented Apr 7, 2021

At the moment, we need to still come up with a good enhancement of the schema to support FASM features annotations to be added to the DeviceResources for a more data-driven approach.

With the current initial implementation, FASM features a quite fixed, and, even though they can still be made more data-driven, some specialization may be required to handle different devices families, hence the adoption of a inheritance based generator, such that different families can change the implementation or have more detailed functions to add and deal with some possibly complex FASM annotations.

This initial implementation is able to generate valid FASM outputs for the const_wire and wire from nextpnr which do work on HW.

This PR was opened mainly to get an early review of the progress.

TODO:

  • Add capability to emit FASM features based on cells attributes
  • Initial handling of routing bels
  • Handle other cells types:
    • LUTs
    • FF
    • BUFG/BUFH
  • Add tests

This function generates all features corresponding to the physical routing
PIPs present in the physical netlist.

At the moment, the convention for the PIPs FASM features is:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To consider:

  • prjoxide (and prjuray) currently uses <TILE_NAME>.PIP.<WIRE1>.<WIRE0> to distinguish pips better from other features
  • prjoxide bitstream tile names don't correspond to the interchange tile names, because prjoxide can have multiple bitstream tiles at one grid location that get merged into one interchange tile

I think some extra annotations in the database could solve this, for example a per-pip fasm prefix (StringIdx means this will be cheap to store as many pips will share the same prefix).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, for the additional PIP annotation in the feature there might be two solutions:

  • pass a pattern to correctly fill the FASM features, as all PIPs should have a tile_prefix and two wires, this should be easily doable
  • begin a transition process to have a standard in the various databases, such that all adopt the same convention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a way to specify the FASM feature formatting so to add the .PIP. part.

I think that enabling the FASM prefixes from the device resources database can be done in follow-up PRs

self.cells_features.append("{}.{}.{}".format(
cell_data.tile_name, iob_site, feature))

def handle_site_thru(self, site_thru_pips):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think handling of LUT route-through PIPs is going to be needed here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I modified the FF test in nextpnr to get to use FFs in SLICEs (happens often in small tests that the ILOGIC/OLOGIC FFs are used instead) and this indeed showed the need of handling these PIPs. I might generalize a bit more how the LUT truth table is generated to better handle this case.

I guess it will be critical to correctly handle cases where, for instance, the LUT6 is used as a route-thru and the LUT5 is used as a normal LUT (I guess this is possible right?) therefore requiring the final FASM feature to enable both of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the LUT6 is used as a route-thru and the LUT5 is used as a normal LUT (I guess this is possible right?)

yes, I believe things like this can definitely occur

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI there are 3 ways LUTs get written:

  • Direct LUT cell placements
  • Pseudo-site pips (as wire) during site routing
  • Psuedo pips (as wire) during general routing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added LUT-thru handling, and double checked with a slightly modified FF tests which yields LUT-thrus in the physical netlist.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 9, 2021

@litghost @gatecat with the current developement of the FASM generator (4205491) I was able to get a working FASM for the counter test, which includes clocking resources and FFs.

There are two problems though which need to be solved:

  • There are some extra features corresponding to the clock tree that need to be emitted, but the information on which of these feature needs to be emitted has to be extracted starting from the PIP which was used. To make an example we have this features which are missing from the FASM generator output:
222,223d221
< CLK_BUFG_REBUF_X60Y38.GCLK31_ENABLE_BELOW
<
228,229d225
< CLK_BUFG_REBUF_X60Y90.GCLK31_ENABLE_ABOVE
<
274,275d269
<
< HCLK_CMT_X8Y78.HCLK_CMT_CK_BUFHCLK11_USED

The HCLK_CMT_X8Y78.HCLK_CMT_CK_BUFHCLK11_USED depends on the fact that there is one HCLK_CMT_[LR] tile that has an active pip, and it should be in the same clock region. I believe this information can be extracted from the device resources, so I guess it is not that big of a problem.
To note that, even without these features though, the nextpnr+FASM-generator bitstream works as expected.

  • The other issue is that the current version of the nextpnr-fpga_interchange results in a non-working counter on HW. I reverted back to a working revision and noticed that one of the difference between working and non-working is that, in the latter, the input clock is routed through a BUFG pseudo-pip and the output of the BUFG pseudo-pip is fed in yet another BUFG. There might be other changes which affect the results, and I'll need to investigate what might be the cause.

BUFG route-through
Screenshot from 2021-04-09 18-13-05

@acomodi acomodi changed the title WIP: Initial FASM generator Initial FASM generator Apr 9, 2021
@gatecat
Copy link
Contributor

gatecat commented Apr 9, 2021

The HCLK_CMT_X8Y78.HCLK_CMT_CK_BUFHCLK11_USED depends on the fact that there is one HCLK_CMT_[LR] tile that has an active pip, and it should be in the same clock region. I believe this information can be extracted from the device resources, so I guess it is not that big of a problem.

I think some kind of generic pip/wire-to-extra-feature map might be needed long term. iirc the UltraScale+ also has features that must be set whenever the node that contains a wire is used, even when the driving/sinking pip are in different tiles to that feature.

@gatecat
Copy link
Contributor

gatecat commented Apr 9, 2021

The other issue is that the current version of the nextpnr-fpga_interchange results in a non-working counter on HW. I reverted back to a working revision and noticed that one of the difference between working and non-working is that, in the latter, the input clock is routed through a BUFG pseudo-pip and the output of the BUFG pseudo-pip is fed in yet another BUFG. There might be other changes which affect the results, and I'll need to investigate what might be the cause.

can you upload the failing DCP so I can take a peek at it?

@acomodi
Copy link
Contributor Author

acomodi commented Apr 9, 2021

I think some kind of generic pip/wire-to-extra-feature map might be needed long term. iirc the UltraScale+ also has features that must be set whenever the node that contains a wire is used, even when the driving/sinking pip are in different tiles to that feature.

Indeed, I need to enhance the current implementation to be able to actually add features enabled by driving/sinking pips in different tiles, all the features belonging to the same tile are now handled.

can you upload the failing DCP so I can take a peek at it?

Sure, here is a tarball containing also the netlists: counter_arty35t.tar.gz

@litghost litghost requested review from litghost and removed request for litghost April 9, 2021 22:29
@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

@gatecat It seems that the BUFG route-through is the issue here and it is caused by the sink tile location-based lookahead results introduced here.

It is possible that, with no timing data, the sink location-based lookahead might not be suitable for non-so-regular tiles such as the clock resources tiles, which may cause some mispredictions of the lookahead, leading the router in the wrong direction. In this case a non-required BUFG route-through to route the clock from the clock PAD.

@gatecat
Copy link
Contributor

gatecat commented Apr 12, 2021

I personally think we need to special-case clock routing anyway, I will look into this.

But an extra BUFG route through on its own shouldn't cause problems so I suspect this is a FASM generator bug too - possibly around some of the global clock bits.

Does the bitstream from Vivado via the DCP route work? That'd rule out a problem on the nextpnr side.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

Does the bitstream from Vivado via the DCP route work?

Currently, I am testing DCP -> bitstream only, so to keep the FASM generation out of the equation, and at the current master it does not work.

@gatecat
Copy link
Contributor

gatecat commented Apr 12, 2021

OK, that's a more interesting issue. I think the BUFG route through is probably not the root cause still, and I'll try and do some testing myself later today.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

I think I have an idea of what is happening, but I need to double-check this.

Basically, when the route-through in the BUFG is used, the inverters on some of the BUFGCTRL pins are not being disabled/enabled. For instance, given that there is no cell representing the route-through, the Vivado has no information that this should be a BUFG (which correctly handles the CE, S, etc. pins) instead of a BUFGCTRL, hence the following bel pins are set to GND by default:

  • CE0
  • S0

Having them set to GND basically disables the BUFG, hence no clock is propagated.

@gatecat
Copy link
Contributor

gatecat commented Apr 12, 2021

Interesting - I thought Vivado was supposed to do this automatically when it sees a route-through pip; but maybe something about the interchange format is going wrong. Looking at the unpacked FASM from the Vivado bitstream, indeed the only feature for the lower, routed-through, BUFG is CLK_BUFG_TOP_R_X60Y53.BUFGCTRL.BUFGCTRL_X0Y0.IN_USE with none of the necessary ZINV bits set so it seems your theory is right.

I think for now the easiest hotfix might just be to blacklist the BUFG route-through pips during database generation.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

I think for now the easiest hotfix might just be to blacklist the BUFG route-through pips during database generation.

You mean when populating the chip info for the nextpnr chipdb, right?

From a FASM generation point of view I think it would be quite straightforward to add the correct FASM features when the pseudo pip is used.

@gatecat
Copy link
Contributor

gatecat commented Apr 12, 2021

See #60 for a possible fix; I haven't tested on hardware but opened up the DCP and verified it wasn't using the BUFGCTRL route-through.

@gatecat
Copy link
Contributor

gatecat commented Apr 12, 2021

From a FASM generation point of view I think it would be quite straightforward to add the correct FASM features when the pseudo pip is used.

This doesn't fix the DCP route though - while that needs to be fixed too at some point (e.g. by creating an actual BUFGCTRL cell?) I think that's a longer term issue and this at least unbreaks builds.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

This doesn't fix the DCP route though - while that needs to be fixed too at some point (e.g. by creating an actual BUFGCTRL cell?) I think that's a longer term issue and this at least unbreaks builds.

Indeed, given that the proper solution is not yet clear, I'll open an issue to include the information gathered so far, so we don't lose track of it.

@acomodi
Copy link
Contributor Author

acomodi commented Apr 12, 2021

See #60 for a possible fix; I haven't tested on hardware but opened up the DCP and verified it wasn't using the BUFGCTRL route-through.

Tested on HW and it properly works now.

At the moment, we need to still come up with a good enhancement of the
schema to support FASM features annotations to be added to the Device
Resources for a more data-driven approach.

With the current initial implementation, FASM features a quite fixed,
and, even though they can still be made more data-driven, some
specialization may be required to handle different devices families,
hence the adoption of a inheritance based generator, such that different
families can change the implementation or have more detailed functions
to add and deal with some possibly complex FASM annotations.

Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
This also adds some basic support for routing bels

Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
In addition reworked the site-through extra annotations to be more
easily expandable.

Added extra PIP features (mainly concerning clock resources) handling

Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
@acomodi
Copy link
Contributor Author

acomodi commented Apr 13, 2021

@gatecat I believe this PR is ready for another review pass. I still need to add some basic testing, but I have double-checked results produced with the current implementation against the tests in nextpnr, and the following tests work on HW:

  • counter
  • FF
  • lut
  • wire

This commit encapsulates two refactoring changes:

1. Addition of a LUT Mapper class which makes it easier to add FASM
features for both explicit-LUT-cells and LUT-thru wires
2. Better handling of PIPs extra features. Given that there are some
PIPs which, if enabled, enable a set of other features on other tiles,
the Node data needs to be accessed, which takes quite some time (~10
seconds for a xc7a35t device class).

Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
ExtraFeatures(
regex="CLK_BUFG_BUFGCTRL([0-9]+)_I[01]",
features=[
"IN_USE", "ZINV_CE0", "ZINV_S0", "IS_IGNORE1_INVERTED"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the features will need to be different depending on whether the I0/I1 route-thru is used (I know this isn't relevant ATM with BUFGCTRL route-thrus disabled but it will be important in the future).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a second regex with the other pin used and the corresponding features and added a TODO comment to better handle this case, as there is room for generalization I believe.

f = "{}{}".format(prefix, f)
self.add_cell_feature((tile, f))

# TODO: The FASM database should be reformatted so to have more
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed! I think U-Ray is a bit more consistent here already

Signed-off-by: Alessandro Comodi <acomodi@antmicro.com>
Copy link
Contributor

@gatecat gatecat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@acomodi
Copy link
Contributor Author

acomodi commented Apr 13, 2021

Given the size of the PR I believe we can merge it now and, as a follow-up PR I plan to introduce a testing framework for the FASM generation.

On a more general note, I believe that as soon as the nexus FASM generation is in place, we can begin thinking what and how we can generalize and add to the device database to make everything more data-driven.

@gatecat sounds good to you?

@gatecat
Copy link
Contributor

gatecat commented Apr 13, 2021

Yes, I'm happy for this to be merged now.

Dealing with the complexities about pip naming will be the biggest outstanding issue for nexus FASM, I'll have a think and write a bit more about that tomorrow.

@mithro
Copy link
Contributor

mithro commented Apr 13, 2021

At some point we had the idea of writing up a "FASM naming style guide". I'm unsure what ever happened with that...

@acomodi
Copy link
Contributor Author

acomodi commented Apr 14, 2021

At some point we had the idea of writing up a "FASM naming style guide". I'm unsure what ever happened with that...

@mithro The idea is still strongly valid, but it takes some time to create the style guide based on the upsides and downsides of the current naming conventions used in the FASM databases, and time also to apply it to the current FASM databases (xray in particular).

At this point of the development, we needed a working framework to get to the bit generation and be able to test everything in the whole flow, and, with the knowledge gathered by writing up the FASM generator, we can write a more consistent guide and rules that will make much more straightforward the FASM generation process itself.

I think that the next step at this point would be to start coming up with a draft of the styling guide and refine it. While also applying the corresponding changes to the FASM databases

@acomodi acomodi merged commit d295262 into chipsalliance:master Apr 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants