Exceeds buffer capacity #23
Factors are multiplicatively cumulative, so to determine the tile size at the Global Buffer I'll need to know the factors at all levels inside the Global Buffer as well.
@angshuman-parashar Thanks. Below are the factors of the global buffer. Is that what we need to compute the tile size?
No, that's not enough. As you can see, that's storage level #4. I need to know the factors for levels 0, 1, 2, and 3 as well; the product of all of those factors gives you the tile size at level 4. Perhaps that explains why your buffer is overflowing?
@angshuman-parashar Thanks! What are the equations behind this? For example, is the tile size for level 0 the product of all of its factors (R, S, P, Q, C, K, N)? For the upper levels, could you show me the equation that computes the tile size from the lower levels and the level itself? I'm putting all the factors below. Thanks!
First, calculate each dimension as the product of all its factors. Multiplying over all levels (temporal + spatial) from 0 through 4, we get: R = 7, S = 7, P = 112, Q = 8, C = 1, K = 64, N = 1. This gives us the problem-space (iteration-space) tile at level 4. Next, project this problem-space tile into the data-spaces (i.e., tensors) to obtain the tile shapes for those spaces: weights = R*S*C*K = 3,136, outputs = N*K*Q*P = 57,344, and inputs = N*C*(S+(Q-1)*Hstride)*(R+(P-1)*Wstride) = 4,809 (assuming dilation = 1), giving us a total of 65,289 entries. You can multiply that by the word size to get the capacity in bytes. Now I'm curious, because it doesn't match the error message (unless I messed up the math somewhere above). Could you please email or upload the entire .cfg (arch, mapping, everything) so that I can reproduce at my end?
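The arithmetic above can be checked with a short sketch (this is not Timeloop's actual code; the dimension values, strides, and projection formulas are taken directly from this comment):

```python
# Per-dimension products of all factors from levels 0 through 4,
# as quoted in the thread, with Hstride = Wstride = 2 from the problem spec.
R, S, P, Q, C, K, N = 7, 7, 112, 8, 1, 64, 1
Hstride, Wstride = 2, 2

# Project the problem-space tile into each data-space (tensor).
weights = R * S * C * K                                              # 3,136
outputs = N * K * Q * P                                              # 57,344
inputs = N * C * (S + (Q - 1) * Hstride) * (R + (P - 1) * Wstride)   # 4,809
total = weights + outputs + inputs                                   # 65,289

print(weights, outputs, inputs, total)
```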
@angshuman-parashar Thanks for doing the computation! I really appreciate it. The error for this set of parameters below is
@angshuman-parashar Another question: I assume that the permutation will not affect the tile size; is that true? Further, I guess that only the non-one factors matter in the permutation in terms of performance implications. For example, if I have
Re. your earlier question: look at the bypass settings. Weights are being bypassed at that level, and 65,289 - 62,153 = 3,136, which is exactly the weight tile :). Re. your most recent question: correct, the permutation does not affect tile size. And correct, only permutations of non-unit factors affect performance/energy efficiency. In fact, this is something the mapper exploits to prune the search space.
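The bypass accounting can be sketched as follows (assuming, as this reply implies, that a tensor bypassed at a level simply does not count against that level's capacity; the entry counts are the ones computed earlier in the thread):

```python
# Tile sizes in entries, from the earlier projection step.
weights, outputs, inputs = 3136, 57344, 4809

total_all = weights + outputs + inputs            # 65,289 if nothing is bypassed
total_with_weight_bypass = total_all - weights    # 62,153 with weights bypassed

print(total_all, total_with_weight_bypass)
```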
@angshuman-parashar Thanks! That makes a lot of sense. I really appreciate it!
Hi @aleczhanshi and @angshuman-parashar: I am facing a similar issue while trying to convert the mapper's output map.txt file to .yaml format for timeloop-model. I am specifically working on the tutorial example, for the mapping given in ref-output:
However, when I run:
Hi there,
As I'm playing with different configurations, I've run into:
ERROR: couldn't map level GlobalBuffer: mapped tile size 428201 exceeds buffer capacity 65536
I've been trying to dig into the codebase to figure out why this happens, but it seems hard to work out from the code alone. Could you briefly explain (hopefully in math) how the mapped tile size and the buffer capacity are computed from the problem shape (RSPQCKN) and the arch specs (sizeKB, entries, word-bits, instances, etc.)?
Here are my
problem shape =
(R = 7; S = 7; P = 112; Q = 112; C = 3; K = 64; N = 1; Wstride = 2; Hstride = 2;)
factors =
("R1 S1 P112 Q1 C1 K1 N1")
and arch spec
(sizeKB = 128; instances = 1; meshX = 1; word-bits = 16; block-size = 4; read_bandwidth = 16; write_bandwidth = 16;)
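For the capacity side of the error, the "65536" can be derived from the arch spec above (a sketch assuming capacity is counted in words, i.e., sizeKB of storage divided by the word size in bytes):

```python
# Arch spec values from the question.
sizeKB = 128
word_bits = 16

# Capacity in entries = bytes of storage / bytes per word.
capacity_entries = sizeKB * 1024 // (word_bits // 8)
print(capacity_entries)  # 65536, matching the capacity in the error message
```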
Thanks in advance!