Using 400 AIE tiles through PLIO problem #292

Closed
CoffeeCat3008871 opened this issue Aug 24, 2022 · 6 comments

Comments

@CoffeeCat3008871

I am trying to use all 400 AIE tiles to run a simple adder graph kernel, but I hit the error shown at the bottom. Each adder kernel requires 2 input streams and 1 output stream, so I created 400 PLIOs for input 1, 400 for input 2, and 400 for the output: 1200 PLIOs in total. However, it seems that the maximum number of IOs I can create is 900. The error message suggests that -ftemplate-depth can help, but I have tried it everywhere I could in the makefile and nothing has worked so far. Reducing the number of PLIOs to 100 each, for a total of 300, works well. Can anyone help with this problem?

src/aie/adder_x400.cpp:2414:48: required from here
/tools/Xilinx/Vitis/2022.1/aietools/include/adf/new_frontend/adf_api_impl.h:120:31: fatal error: template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum)
getPlatformIoAttrs(ioAttrOrFiles, ios...);
~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
ERROR: [aiecompiler 77-753] This application has discovered an exceptional condition from which it cannot recover while executing the following command

${XILINX_VITIS_AIETOOLS}/tps/lnx64/gcc/bin/g++ -std=c++17 -I . ./Work/temp/adder_x400.processed.ii -o ./Work/temp/adder_x400.out -L /tools/Xilinx/Vitis/2022.1/aietools/lib/lnx64.o -g -O0 -Wl,--unresolved-symbols=ignore-all -Wno-return-stack-address -Wno-missing-declarations -lmeir_frontend -ladf_api_frontend .
Please check the output log for errors and fix those before you run the application.
(WARNING:0, CRITICAL-WARNING:0, ERROR:1)
/tools/Xilinx/Vitis/2022.1/aietools/bin/aiecompiler: line 83: kill: (-22912) - No such process
makefile_versal_ps.mk:141: recipe for target 'libadf.a' failed
make: *** [libadf.a] Error 255
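
For readers wondering why the limit in the log is exactly 900: that is GCC's default template instantiation depth, and the line quoted in the error (`getPlatformIoAttrs(ioAttrOrFiles, ios...)`) looks like a recursive variadic call. The sketch below is plain C++, not the ADF header; `countPorts` is a hypothetical stand-in, and the assumption is that the real helper peels off one argument per recursion step in a similar way. Under that assumption, each extra argument costs roughly one more nested instantiation whenever the compiler has to instantiate the levels eagerly, so 1200 arguments would overrun a depth of 900 while 300 would not.

```cpp
#include <cstddef>

// Hypothetical stand-in for a recursive variadic helper.
// Base case: no arguments left.
constexpr std::size_t countPorts() { return 0; }

// Recursive case: consume one argument, then recurse on the rest.
// Instantiating this for N arguments requires the N-1 version,
// which requires the N-2 version, and so on down to the base case.
template <typename First, typename... Rest>
constexpr std::size_t countPorts(const First&, const Rest&... rest) {
    return 1 + countPorts(rest...);
}

// Evaluated at compile time: three arguments, three recursion levels.
static_assert(countPorts(1, 2, 3) == 3, "one level per argument");
```

Whether `-ftemplate-depth` can actually be forwarded to the internal g++ call made by aiecompiler is not settled in this thread.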

@OTremois
Contributor

Hi, I am Olivier from AMD AECG. Can you share the tutorial you used?
Thanks
Olivier

@CoffeeCat3008871
Author

> Hi, I am Olivier from AMD AECG. Can you share the tutorial you used? Thanks Olivier

Hi, Olivier, thank you for your help. I used this tutorial as a base framework.
https://github.com/Xilinx/Vitis-Tutorials/tree/2022.1/AI_Engine_Development/Design_Tutorials/08-n-body-simulator

The difference is that I did not use the Explicit Packet Switching technique, since the simple adder example has only two input streams.

@OTremois
Contributor

If you do not use the packet switching mechanism, you are limited to 312 input streams (39 available columns x 8 streams (64 bits) per column, which are further rearranged inside the AI Engine array into 50 available columns x 6 streams (300)) and 234 outputs (39 columns x 6 streams per column, coming from 50 columns x 4 streams (200) inside the array).
So you can implement a maximum of 150 simple adders (I would say a little less, as routing to the PL may be difficult) without using the packet switching mechanism.
In order to use the full 400 AI Engines, you need to use packet switching to share inputs and outputs among the 4 kernels of the nbodySubsystem graph.
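
The arithmetic in this comment can be checked with a small sketch. It uses only the numbers quoted above. Taking the smaller of the interface-side and array-side counts as the usable budget is an interpretation on my part (it is what makes the 150-adder figure come out), and the names `StreamBudget`, `streamBudget`, and `maxKernels` are made up for illustration.

```cpp
#include <algorithm>

// Figures quoted in the comment above.
constexpr int kInterfaceColumns  = 39;  // columns available at the PL interface
constexpr int kPlToAiePerColumn  = 8;   // 64-bit input streams per interface column
constexpr int kAieToPlPerColumn  = 6;   // output streams per interface column
constexpr int kArrayColumns      = 50;  // columns inside the AI Engine array
constexpr int kArrayInPerColumn  = 6;   // input streams per array column
constexpr int kArrayOutPerColumn = 4;   // output streams per array column

struct StreamBudget {
    int inputs;
    int outputs;
};

// Assumption: the usable budget is the tighter of the two limits.
constexpr StreamBudget streamBudget() {
    return {std::min(kInterfaceColumns * kPlToAiePerColumn,    // 312
                     kArrayColumns * kArrayInPerColumn),       // 300
            std::min(kInterfaceColumns * kAieToPlPerColumn,    // 234
                     kArrayColumns * kArrayOutPerColumn)};     // 200
}

// Upper bound on kernels that each need dedicated (non-shared) streams.
constexpr int maxKernels(StreamBudget b, int inPerKernel, int outPerKernel) {
    return std::min(b.inputs / inPerKernel, b.outputs / outPerKernel);
}

// A simple adder needs 2 inputs and 1 output.
static_assert(maxKernels(streamBudget(), 2, 1) == 150, "matches the 150 adders above");
```

With 2 inputs per adder, the input side is the bottleneck (300 / 2 = 150), which is why 400 adders with dedicated PLIOs cannot fit without sharing streams.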

@CoffeeCat3008871
Author

That's very helpful. Thank you a lot!

@amd-jamstine

Does anyone have a packet switching example to look at for using all 400 AI Engines?

@Albresky

Albresky commented Nov 2, 2024

> If you do not use packet switching mechanism then you are limited to 312 input streams (39 available columns x 8 streams (64 bits) per column, which is further re-arranged inside the AI Engine array into 50 available columns x 6 streams (300)), and 234 outputs (39 columns x 6 streams per column, coming from 50 columns x 4 streams (200) inside the array). So you can implement a maximum of 150 simple adders (I would say a little less as routing to the PL may be difficult) without using packet switching mechanism. In order to use the full 400 AI Engines you need to use packet switching to share inputs and output among the 4 kernels of the nbodySubsystem graph.


Hi, Olivier, I've encountered a PLIO-related issue as well.

As I understand it, there are 39 PLIO ports available. In detail, at a 64-bit data width, each PLIO port has 8 channels available from PL to AIE and 6 channels from AIE to PL. So, when the data width expands from 64 bits to 128 bits, there should be $64/128 \times 8 \times 39 = 156$ PLIO channels in total from PL to AIE.

When I used aiecompiler in Vitis 2021.1, a graph with 88 input PLIOs and 33 output PLIOs, all 128 bits wide, compiled successfully. But when I used aiecompiler in Vitis 2023.2 and later (so far, up to Vitis 2024.1), it failed with "ERROR: [AIE-PRE-MAPPER-11] area group ((0, 0), (49, 0)) has capacity 78 PLIO 128 bit incoming registered channels".
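
To make the mismatch concrete, here is a small sketch that compares the two numbers in this comment: the 156 channels expected from scaling the 64-bit figure, and the capacity of 78 reported by the newer aiecompiler. The scaling rule is the one proposed in the comment, not a documented formula, and `expectedChannels` and `fits` are hypothetical helper names.

```cpp
// Scaling rule proposed in the comment: channels shrink in proportion
// to the data width, relative to the 64-bit baseline.
constexpr int expectedChannels(int ports, int channelsAt64Bit, int widthBits) {
    return ports * channelsAt64Bit * 64 / widthBits;
}

// True when the requested number of channels is within the capacity.
constexpr bool fits(int requested, int capacity) {
    return requested <= capacity;
}

constexpr int kGraphInputs      = 88;  // 128-bit input PLIOs in the graph
constexpr int kReportedCapacity = 78;  // from the AIE-PRE-MAPPER-11 message

static_assert(expectedChannels(39, 8, 128) == 156, "64/128 x 8 x 39");
static_assert(fits(kGraphInputs, expectedChannels(39, 8, 128)), "88 <= 156");
static_assert(!fits(kGraphInputs, kReportedCapacity), "88 > 78");
```

So the graph fits under the expected 156 but not under the reported 78, which is exactly half of it. The thread does not say which of the two figures reflects the hardware.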

Is this a tool bug? I've seen several users encounter this issue, but there has been no official response from AMD/Xilinx.

Could you help explain how PLIOs work at different data widths? Many thanks.


4 participants