
Android app - Error - Attempted to resize a static tensor to a new shape at dimension 0 #1350

Open
adonnini opened this issue Dec 5, 2023 · 54 comments
Assignees
Labels
bug Something isn't working module: kernels Issues related to kernel libraries, e.g. portable kernels and optimized kernels module: xnnpack Issues related to xnnpack delegation triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@adonnini

adonnini commented Dec 5, 2023

My Android application fails with an `Attempted to resize a static tensor to a new shape at dimension 0` error. Please find the full logcat below.

The shape of the input datasets in my model is not static. Specifically, the number of steps varies from one sequence to the next.

Here is the code I use to define the input dataset for the model in the Android application:

    float[] flat = flatten(tmpData);  // row-major flattening of the 2-D input
    final long[] shapeArrDataPytorchFlattened = new long[]{tmpData.length, 4, 1};
    arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);

where 4 is the number of features and tmpData.length is the number of rows in the input dataset (which has n rows and 4 columns).
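The flatten(...) helper is not shown above; as a point of reference, a minimal sketch of what such a helper presumably does (row-major flattening of a float[n][4] array into the buffer passed to Tensor.fromBlob) could look like the following. The class name FlattenUtil is hypothetical:

```java
// Hypothetical sketch of a flatten() helper for a 2-D float array.
// Flattens in row-major order, matching the {n, 4, 1} shape given to Tensor.fromBlob.
public class FlattenUtil {
    public static float[] flatten(float[][] data) {
        int cols = (data.length == 0) ? 0 : data[0].length;
        float[] flat = new float[data.length * cols];
        int i = 0;
        for (float[] row : data) {
            for (float v : row) {
                flat[i++] = v;   // copy each feature value in row-major order
            }
        }
        return flat;
    }
}
```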

Here is the code I use to run inference:

    try {
        Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Abut to run inference --- ");
        outputTensor = mModule.forward(IValue.from(arrDataPytorch)).toTensor();
    } catch (Exception e) {
        Log.i(TAG, " - neuralNetworkloadAndRunPytorch - Inference FAILED --- ");
        throw new RuntimeException(e);
    }

By contrast, when I run inference on the same model exported with TorchScript and run with PyTorch Mobile, I produce the input dataset as follows:

    final long[] shapeArrDataPytorchFlattened = new long[]{1, flat.length};   // USED FOR PYTORCH MOBILE
    arrDataPytorch = Tensor.fromBlob(flat, shapeArrDataPytorchFlattened);

and run inference as follows:

    mModule = LiteModuleLoader.load(moduleFileAbsoluteFilePath);
    outputTensor = mModule.forward(IValue.from(arrDataPytorch)).toTensor();

This works, producing reasonable results.

I would appreciate any thoughts as to what is causing the problem, and how I might go about fixing it.

Thanks

LOGCAT

12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - neuralNetworkInputPreparationRunning - 1 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - neuralNetworkLoadAndRunRunning - 0 - 0
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - NeuralNetworkServiceRunnable - About to run neuralNetworkloadAndRun --- 
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Running - 
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - locationInformationDir - /data/user/0/com.android.contextq/files/locationInformation/
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - savedNetworkArchiveLength - 120669888
12-05 16:48:49.983: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Abut to load module --- 
12-05 16:48:50.067: I/ETLOG(16887): Model file /data/user/0/com.android.contextq/files/locationInformation/tfmodel_exnnpack.pte is loaded.
12-05 16:48:50.067: I/ETLOG(16887): Setting up planned buffer 0, size 23366800.
12-05 16:48:50.077: W/libc(16887): Access denied finding property "ro.hardware.chipname"
12-05 16:48:50.078: W/adbd(13666): timeout expired while flushing socket, closing
12-05 16:48:50.080: D/XNNPACK(16887): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.080: D/XNNPACK(16887): created workspace of size 774176
12-05 16:48:50.081: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.085: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.088: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.092: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.092: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.092: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.096: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.097: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.113: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.127: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.130: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.130: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.130: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.132: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.146: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.150: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.150: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.150: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.152: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.166: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.170: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.170: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.170: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.172: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.186: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.190: D/XNNPACK(16887): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
12-05 16:48:50.190: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.190: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.192: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.206: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.209: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.213: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.217: D/XNNPACK(16887): created workspace of size 1327136
12-05 16:48:50.217: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.221: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.224: D/XNNPACK(16887): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
12-05 16:48:50.224: D/XNNPACK(16887): created workspace of size 42368
12-05 16:48:50.225: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.225: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.229: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.232: I/XNNPACK(16887): fuse Clamp Node #2 into upstream Node #1
12-05 16:48:50.234: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.249: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.263: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.269: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.273: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.276: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.277: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.281: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.284: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.286: D/StNfcHal(979): (#0C838) Rx 60 07 01 e2 
12-05 16:48:50.286: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.301: D/XNNPACK(16887): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.315: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.319: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.322: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.323: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.327: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.331: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.334: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.338: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.338: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.342: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.344: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.357: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.358: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.361: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.362: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.365: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.367: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.381: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.381: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.385: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.385: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.389: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.390: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.404: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.405: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.408: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.409: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.412: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.414: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.427: D/XNNPACK(16887): created workspace of size 663584
12-05 16:48:50.428: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.431: D/XNNPACK(16887): created workspace of size 387104
12-05 16:48:50.432: D/XNNPACK(16887): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.435: I/XNNPACK(16887): fuse Clamp Node #1 into upstream Node #0
12-05 16:48:50.437: D/XNNPACK(16887): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.450: D/XNNPACK(16887): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
12-05 16:48:50.467: I/NeuralNetworkService(16887):  - neuralNetworkloadAndRunPytorch - Abut to run inference --- 
12-05 16:48:50.467: I/ETLOG(16887): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 12716
12-05 16:48:50.467: I/ETLOG(16887): Error setting input 0: 0x10
12-05 16:48:50.467: I/ETLOG(16887): In function forward(), assert failed: set_input_status == Error::Ok
12-05 16:48:50.467: A/libc(16887): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 16905 (Thread-2), pid 16887 (lNetworkService)
12-05 16:48:50.635: I/crash_dump64(17226): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto

@adonnini adonnini changed the title Android app - Attempted to resize a static tensor to a new shape at dimension 0 Android app - Error - Attempted to resize a static tensor to a new shape at dimension 0 Dec 6, 2023
@cccclai cccclai added module: xnnpack Issues related to xnnpack delegation and removed module: xnnpack Issues related to xnnpack delegation labels Dec 6, 2023
@cccclai
Contributor

cccclai commented Dec 6, 2023

The difference between the Lite Interpreter (PyTorch Mobile) and ExecuTorch is that, in ExecuTorch, we plan memory ahead of time, which helps us reuse and reduce memory at runtime. What is the dynamic part of the original PyTorch model? Can the dynamic part be upper-bounded?

Reference doc: https://pytorch.org/executorch/stable/compiler-memory-planning.html

@adonnini
Author

adonnini commented Dec 6, 2023

@cccclai Thanks for the response. Please bear with me as I am a beginner with Pytorch (my background is java and Android application development).

I did not create the model I am using (https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master). I modified it slightly to accommodate my datasets (produced by my Android application).

When you ask about the dynamic part of the model, could you please clarify? Dynamic with respect to which variables/parts of the model?

The model is an attention-based Transformer network. Depending on what you mean by dynamic: since it is an encoder/decoder model, the input passed from one part of the model to another is dynamic.

I read the document you referenced in your message. Would it make sense for me to use the
alloc_graph_input=False option? If I did, it's not clear what I would need to do in my application to satisfy this:
"If the IO is not planned then users will be expected to provide data buffers to back these values at runtime"
Did I misunderstand the documentation?

Thanks

@kimishpatel
Contributor

If the IO is not planned then users will be expected to provide data buffers to back these values at runtime

What this means is that the inputs you provide will either a) be copied to the memory planned for them during the memory-planning pass, if IO was part of memory planning, or b) not be copied, if memory planning did not plan for them.

By default IO is planned, and hence if you follow this, https://github.com/pytorch/executorch/blob/main/examples/demo-apps/android/jni/jni_layer.cpp#L345, you will see that the output returned from the executor is referenced directly, and there is a comment on the lifetime of the pointer referenced by the output tensor.

Now, with respect to dynamic size, there are a couple of things.

  1. Export + ExecuTorch support upper-bound dynamic sizes. That is, you can tag the input tensors as having bounded dynamic size, e.g. for each dimension of the input, what is the maximum size that dimension can take. This helps export understand the maximum size expected for the input, output and intermediate tensors. You can read more here: https://pytorch.org/docs/stable/export.html#expressing-dynamism, and @JacobSzwejbka is an expert on this who can give more input.
  2. Do you know what part of your model is dynamic? You may not know this, and that's fine. In that case I presume you just expect that "I should be able to supply input of varying size", right?
  3. Shape dynamism gets a bit more complicated if the model you are interested in is also lowered to a delegate such as XNNPACK. In that case the delegate also needs to support dynamic shapes. For XNNPACK, for example, this is not the case yet, so you won't be able to run a model via XNNPACK that takes inputs of different shapes.
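Until the delegate supports dynamic shapes, one possible app-side workaround (an assumption, not an ExecuTorch or XNNPACK feature) is to pad or truncate every sequence to a fixed upper-bound length before building the input tensor, so the exported model always sees the same static shape. A plain-Java sketch, where the class name, method name and padding value are all hypothetical choices:

```java
// Hypothetical workaround: force a static input shape by padding (or
// truncating) each [steps][features] sequence to exactly maxSteps rows.
public class SequencePadder {
    public static float[][] padToStatic(float[][] seq, int maxSteps, float padValue) {
        int features = (seq.length == 0) ? 0 : seq[0].length;
        float[][] out = new float[maxSteps][features];
        for (int r = 0; r < maxSteps; r++) {
            for (int c = 0; c < features; c++) {
                // keep real data where it exists, fill the rest with padValue
                out[r][c] = (r < seq.length) ? seq[r][c] : padValue;
            }
        }
        return out;
    }
}
```

Note that padding only gives sensible results if the model was trained to ignore (e.g. mask out) the padded steps; otherwise the extra rows change the prediction.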

@kimishpatel kimishpatel added the need-user-input The issue needs more information from the reporter before moving forward label Dec 7, 2023
@adonnini
Author

adonnini commented Dec 7, 2023

@kimishpatel Thanks!

a) It looks like I may have a problem, as XNNPACK does not support shape dynamism and the input sequences in the model I use are of varying length.

b) Your assumption is correct. I expected to be able to provide input of varying size.

c) Given what @cccclai said in his comment regarding the difference between PyTorch Mobile and ExecuTorch, and also based on your point 1. above, I will try the torch.export.dynamic_dim() API.

d) Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device with ExecuTorch? Sorry for bringing it up again. With PyTorch Mobile it was not (unless I am mistaken). Is there an alternative to using XNNPACK?

Thanks

@cbilgin cbilgin added the feature A request for a proper, new feature. label Dec 11, 2023
@cccclai
Contributor

cccclai commented Dec 11, 2023

Why is delegation (e.g. via XNNPACK) necessary in order to lower a model onto an edge device using executorch?

Delegation is for delegating part of the model, or the whole model, to some powerful backend on the device. Different edge devices may have different backends. XNNPACK (https://github.com/google/XNNPACK) is one of the most powerful backends on CPU. For example, on iOS there are other powerful backends (https://github.com/pytorch/executorch/tree/main/backends/apple) like Core ML and MPS, and Qualcomm chipsets may have their own as well.

In PyTorch Mobile, XNNPACK is pretty much the default backend, and it runs after we call optimize_for_mobile. PyTorch Mobile also has a limited set of other backends available, like Core ML (https://pytorch.org/tutorials/prototype/ios_coreml_workflow.html).

@adonnini
Author

@cccclai Thanks.
When I used PyTorch Mobile I chose not to optimize before lowering the model to the edge device. PyTorch Mobile gave me that option, if I am not mistaken.
ExecuTorch does not give you the option to not optimize, right?
By the way, I chose not to use optimization in order to first get the basic mechanism working (i.e. perform inference on the edge device successfully).

@SS-JIA SS-JIA added bug Something isn't working and removed feature A request for a proper, new feature. labels Dec 14, 2023
@SS-JIA SS-JIA self-assigned this Dec 14, 2023
@SS-JIA SS-JIA added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 14, 2023
@SS-JIA
Contributor

SS-JIA commented Dec 15, 2023

@adonnini to provide some context, in PyTorch mobile, the optimize_for_mobile step essentially applies a pre-defined set of transformation passes on the TorchScript model to optimize it for a specific processor (CPU, GPU, etc.).

With ExecuTorch, this process is a bit more involved. Essentially a model is represented by default using the Edge IR. However, since XNNPACK is a powerful library, we provide a delegate which will consume the Edge IR and convert the model to XNNPACK's representation. The converted graph can then be executed using XNNPACK.

Essentially, ExecuTorch provides more control over how your model executes.

Regarding your initial issue, would you mind sharing how you produced your model? As mentioned before, XNNPACK doesn't support dynamic shapes.

@adonnini
Author

@SS-JIA Here is a link to the model I use:
https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master
I modified it slightly to work with my dataset. I also added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code.
Is this the information you were looking for?

@adonnini
Author

adonnini commented Jan 15, 2024

Hi, after fixing the dynamic_dim error (#1379) with @angelayi's greatly appreciated help, I tried once again to use the model for inference in my Android app.

Unfortunately the result was once again an `Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406` error, as described earlier in this issue's thread. Below you will find the latest traceback log.

Please let me know if you need me to do anything, and what I should do next.

TRACEBACK LOG (IT'S LONG. I INCLUDED ALL MESSAGES RELATED TO THE FAILURE RATHER THAN ASSUMING WHAT IS RELEVANT)

01-15 16:08:10.386: I/ETLOG(12852): Model file /data/user/0/com.android.contextq/files/locationInformation/TptDelegate.pte is loaded.
01-15 16:08:10.386: I/ETLOG(12852): Setting up planned buffer 0, size 31460272.
01-15 16:08:10.400: W/libc(12852): Access denied finding property "ro.hardware.chipname"
01-15 16:08:10.422: D/XNNPACK(12852): allocated 6144 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.423: D/XNNPACK(12852): created workspace of size 774176
01-15 16:08:10.424: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.442: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.461: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.479: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.479: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.480: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.501: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.575: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.597: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.603: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.603: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.604: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.621: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.622: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.622: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.623: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.641: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.641: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.642: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.659: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.660: D/XNNPACK(12852): reusing tensor id #4 memory for tensor id #3 Node #2 Softmax
01-15 16:08:10.660: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.661: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.666: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.672: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): allocated 8192 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.676: D/XNNPACK(12852): created workspace of size 1327136
01-15 16:08:10.677: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.683: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.687: D/XNNPACK(12852): reusing tensor id #8 memory for tensor id #5 Node #2 Softmax
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 42368
01-15 16:08:10.687: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.688: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.692: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.697: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.765: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.815: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.821: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.828: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.832: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.832: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.837: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.842: D/XNNPACK(12852): allocated 4202496 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.861: D/XNNPACK(12852): allocated 4196352 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.867: D/StNfcHal(1024): (#007DF) Rx 6f 02 0a (hidden)
01-15 16:08:10.880: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.899: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.916: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.917: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.926: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.933: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.938: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.943: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.944: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.950: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.951: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.957: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.958: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.963: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.963: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.968: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.969: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.974: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.974: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.979: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.979: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.984: D/XNNPACK(12852): created workspace of size 663584
01-15 16:08:10.984: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.989: D/XNNPACK(12852): created workspace of size 387104
01-15 16:08:10.990: D/XNNPACK(12852): allocated 1050624 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:10.994: D/XNNPACK(12852): allocated 16416 bytes for packed weights in Fully Connected (NC, F32) operator
01-15 16:08:11.012: I/NeuralNetworkService(12852):  - neuralNetworkloadAndRunPytorch - Abut to run inference --- 
01-15 16:08:11.012: I/ETLOG(12852): Attempted to resize a static tensor to a new shape at dimension 0 old_size: 27 new_size: 11406
01-15 16:08:11.013: I/ETLOG(12852): Error setting input 0: 0x10
01-15 16:08:11.013: I/ETLOG(12852): In function forward(), assert failed: set_input_status == Error::Ok
01-15 16:08:11.013: A/libc(12852): Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 12870 (Thread-2), pid 12852 (lNetworkService)
01-15 16:08:11.088: I/crash_dump64(13916): obtaining output fd from tombstoned, type: kDebuggerdTombstoneProto
01-15 16:08:11.089: I/tombstoned(719): received crash request for pid 12870
01-15 16:08:11.089: I/crash_dump64(13916): performing dump of process 12852 (target tid = 12870)
01-15 16:08:11.390: A/DEBUG(13916): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-15 16:08:11.390: A/DEBUG(13916): Build fingerprint: 'Fairphone/FP4eea/FP4:13/TKQ1.230127.002/TP20:user/release-keys'
01-15 16:08:11.390: A/DEBUG(13916): Revision: '0'
01-15 16:08:11.390: A/DEBUG(13916): ABI: 'arm64'
01-15 16:08:11.390: A/DEBUG(13916): Timestamp: 2024-01-15 16:08:11.106599870+0100
01-15 16:08:11.390: A/DEBUG(13916): Process uptime: 381s
01-15 16:08:11.390: A/DEBUG(13916): Cmdline: com.android.contextq:ContextQNeuralNetworkService
01-15 16:08:11.390: A/DEBUG(13916): pid: 12852, tid: 12870, name: Thread-2  >>> com.android.contextq:ContextQNeuralNetworkService <<<
01-15 16:08:11.390: A/DEBUG(13916): uid: 10207
01-15 16:08:11.390: A/DEBUG(13916): signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
01-15 16:08:11.390: A/DEBUG(13916):     x0  0000000000000000  x1  0000000000003246  x2  0000000000000006  x3  00000072fe287f50
01-15 16:08:11.390: I/LocationChangeManagement(5959):  - lastKnownLocation - Last known location - lastKnownLocationString - Location[Provider= network, lat= 45.753405, lon= 8.312165, acc= 12,  t= 1705331289499, et= 3625844163303, alt= 722.2000122070312, vel= -1.00, bear= -1.00, {Bundle[{networkLocationType=wifi}]}]
01-15 16:08:11.390: A/DEBUG(13916):     x4  60651f7371647272  x5  60651f7371647272  x6  60651f7371647272  x7  7f7f7f7f7f7f7f7f
01-15 16:08:11.390: A/DEBUG(13916):     x8  00000000000000f0  x9  0000007681304b28  x10 0000000000000001  x11 000000768134484c
01-15 16:08:11.390: A/DEBUG(13916):     x12 00000072fe286520  x13 0000000000000044  x14 00000072fe287868  x15 0000000034155555
01-15 16:08:11.390: A/DEBUG(13916):     x16 00000076813acd68  x17 00000076813884e0  x18 00000072fd7a2000  x19 0000000000003234
01-15 16:08:11.390: A/DEBUG(13916):     x20 0000000000003246  x21 00000000ffffffff  x22 000000769380e9d8  x23 000000769380e9d8
01-15 16:08:11.390: A/DEBUG(13916):     x24 00000072fe2885f0  x25 b4000074c9870870  x26 0000000000000002  x27 0000007693abc378
01-15 16:08:11.390: A/DEBUG(13916):     x28 00000072fe2884c0  x29 00000072fe287fd0
01-15 16:08:11.390: A/DEBUG(13916):     lr  0000007681335788  sp  00000072fe287f30  pc  00000076813357b4  pst 0000000000001000
01-15 16:08:11.390: A/DEBUG(13916): backtrace:
01-15 16:08:11.390: A/DEBUG(13916):       #00 pc 00000000000527b4  /apex/com.android.runtime/lib64/bionic/libc.so (abort+168) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.390: A/DEBUG(13916):       #01 pc 0000000000b95590  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (et_pal_abort+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #02 pc 0000000000b95398  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (torch::executor::runtime_abort()+8) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #03 pc 0000000000b72dac  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)+596) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #04 pc 0000000000b73384  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::dispatch(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&)+236) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #05 pc 0000000000b7ba64  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::CallWithJniConversions<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, 
facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+96) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #06 pc 0000000000b731b4  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::FunctionWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&), facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::HybridClass<executorch_jni::ExecuTorchJni, facebook::jni::detail::BaseHybridClass>::JavaPart, facebook::jni::JObject, void>::_javaobject*>, 
facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>&&))+64) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #07 pc 0000000000b6a754  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/base.apk!libexecutorchdemo.so (facebook::jni::detail::MethodWrapper<facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator> (executorch_jni::ExecuTorchJni::*)(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>), &(executorch_jni::ExecuTorchJni::forward(facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*>)), executorch_jni::ExecuTorchJni, facebook::jni::basic_strong_ref<executorch_jni::JEValue, facebook::jni::LocalReferenceAllocator>, facebook::jni::alias_ref<facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*> >::call(_JNIEnv*, _jobject*, facebook::jni::detail::JTypeFor<facebook::jni::JArrayClass<facebook::jni::detail::JTypeFor<executorch_jni::JEValue, facebook::jni::JObject, void>::_javaobject*>, facebook::jni::detail::JTypeArray, void>::_javaobject*)+44) (BuildId: 8065dc692f8e345f80fe49a1f2162d7e784b3499)
01-15 16:08:11.390: A/DEBUG(13916):       #08 pc 0000000000355830  /apex/com.android.art/lib64/libart.so (art_quick_generic_jni_trampoline+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916):       #09 pc 000000000033eda4  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.390: A/DEBUG(13916):       #10 pc 0000000000511050  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+1976) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #11 pc 0000000000498288  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #12 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #13 pc 0000000000a29dd8  /data/app/~~fViNWQBOJr2R6-BoTC9BtQ==/com.android.contextq-P3yFbp1b-styp-fCS4BJRA==/oat/arm64/base.vdex (com.example.executorchdemo.executor.Module.forward+0)
01-15 16:08:11.391: A/DEBUG(13916):       #14 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #15 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #16 pc 00000000004973dc  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+960) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #17 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #18 pc 000000000000d4fc  /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService.neuralNetworkloadAndRunPytorch+0)
01-15 16:08:11.391: A/DEBUG(13916):       #19 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #20 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #21 pc 000000000049774c  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+1840) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #22 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #23 pc 0000000000007d44  /data/data/com.android.contextq/code_cache/.overlay/base.apk/classes15.dex (com.android.contextq.neuralnetwork.NeuralNetworkService$NeuralNetworkServiceRunnable.run+0)
01-15 16:08:11.391: A/DEBUG(13916):       #24 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #25 pc 0000000000511d1c  /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, bool, art::JValue*)+5252) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #26 pc 0000000000498288  /apex/com.android.art/lib64/libart.so (void art::interpreter::ExecuteSwitchImplCpp<false>(art::interpreter::SwitchImplContext*)+4716) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #27 pc 0000000000357fd8  /apex/com.android.art/lib64/libart.so (ExecuteSwitchImplAsm+8) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #28 pc 000000000000308c  [anon:dalvik-/apex/com.android.art/javalib/core-oj.jar-transformed] (java.lang.Thread.run+0)
01-15 16:08:11.391: A/DEBUG(13916):       #29 pc 0000000000374120  /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.__uniq.112435418011751916792819755956732575238.llvm.420609892041422114)+232) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #30 pc 0000000000373a18  /apex/com.android.art/lib64/libart.so (artQuickToInterpreterBridge+964) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #31 pc 0000000000355968  /apex/com.android.art/lib64/libart.so (art_quick_to_interpreter_bridge+88) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #32 pc 000000000033eda4  /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+612) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #33 pc 0000000000239d54  /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+144) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #34 pc 000000000053a1b0  /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1600) (BuildId: 735f12f804f88d62a2cb437261076ff7)
01-15 16:08:11.391: A/DEBUG(13916):       #35 pc 00000000000ba650  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)
01-15 16:08:11.391: A/DEBUG(13916):       #36 pc 0000000000053ffc  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+68) (BuildId: bf5f1ce73f89cca7d6a062eb7877e86a)

@adonnini
Author

adonnini commented Jan 16, 2024

@SS-JIA in your comment above you state:
"XNNPACK doesn't support dynamic shapes."
yet, @cccclai in his comment above states:
"In PyTorch Mobile, XNNPACK is pretty much like a default backend and it runs after we call optimized_for_mobile."

My model does use dynamic shapes, and I was able to run it for inference successfully from my Android application using the PyTorch Mobile runtime engine (skipping the optimization step).

If I was able to run my model successfully using PyTorch Mobile because I skipped the optimization step, then why is there no way to skip optimization when using ExecuTorch? This would seem to be a reasonable option to have.

As far as I know, models with dynamic shapes are not the exception. How, and when, will it be possible to run models with dynamic shapes on Android devices using the ExecuTorch runtime engine?

If the answer to both questions above is negative, then it looks like I will not be able to use ExecuTorch for my models. That would be really too bad.

Please let me know if I misunderstood your comment and if I am missing something.

Thanks

@adonnini
Author

@SS-JIA would it help if I sent you the .pte file produced when training my model using executorch?

Also, Here is a link to the model I use:
https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master
I modified it slightly to work with my dataset. I also added ExecuTorch code to train.py, which I use to produce the model. For the time being, I commented out the validation code. If you like, I can send you the modified train.py I used to produce the .pte file.

I hope you will have the time to let me know how I should proceed. Thanks

@kimishpatel
Contributor

@mcr229 can you take a look at the dynamic shape support issue in XNNPACK?

@guangy10 guangy10 added module: xnnpack Issues related to xnnpack delegation module: kernels Issues related to kernel libraries, e.g. portable kernels and optimized kernels and removed need-user-input The issue needs more information from the reporter before moving forward labels Jan 30, 2024
@mcr229
Contributor

mcr229 commented Feb 14, 2024

Hi @adonnini, the XNNPACK delegate currently only supports inputs with static shapes. We are actively working on upstreaming dynamic shape support to XNNPACK, and once that is finished we will be able to leverage it by updating our XNNPACK commit.

@adonnini
Author

adonnini commented Feb 14, 2024 via email

@mcr229
Contributor

mcr229 commented Feb 14, 2024

We expect to have this ready within the next two weeks.

@adonnini
Author

adonnini commented Feb 14, 2024 via email

@mcr229
Contributor

mcr229 commented May 24, 2024

@adonnini I believe ExecuTorch does upper-bounded memory planning, and I know that the XNNPACK delegate does as well. I'm not entirely sure how ExecuTorch handles very large max values with respect to memory planning. The XNNPACK-delegated portions will use the upper bound for their initial memory planning. XNNPACK can actually go above the maximum value, but at some performance cost, since memory is reallocated for the new, larger amount at that inference. My concern with too large a maximum value is that XNNPACK may throw out-of-memory errors as it tries to allocate extremely large intermediate tensors. So I would try to set the most realistic maximum tensor size.
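To make the trade-off concrete, here is a back-of-envelope sketch (assuming the float32 inputs of shape [seq_len, 4, 1] described at the top of this issue; the numbers and the helper name are purely illustrative) of what a declared upper bound costs for the input buffer alone:

```python
# Rough sketch: bytes a planner would reserve for a single float32
# input of shape [seq_len, 4, 1] at a given upper bound.
# Illustrative only -- real planning also covers intermediate tensors,
# which can dominate for transformer-style models.

FLOAT32_BYTES = 4
FEATURES = 4

def planned_input_bytes(max_seq_len: int) -> int:
    return max_seq_len * FEATURES * 1 * FLOAT32_BYTES

# A very loose bound (like the max=100000 used later in this thread)
# reserves ~1.6 MB for the input alone, while a realistic cap is tiny:
print(planned_input_bytes(100_000))  # 1600000
print(planned_input_bytes(128))      # 2048
```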

cc. @JacobSzwejbka, @cccclai, @larryliu0820 for the dynamic memory planning

@adonnini
Author

adonnini commented May 26, 2024

@mcr229 after adding the dynamic shapes code, execution failed producing the traceback log reported below. For your reference, below you will also find the code that produced the failure.

The ExecuTorch-related code is inserted in the training epoch loop. It runs after a training step which, based on the logs, completed successfully. I am pointing this out because I find this line in the traceback log puzzling (not surprising):

torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12

enc_input and dec_input in the model can be equal, but are not required to be.

Here is a print of their shapes:

 - train_minimum - Lowering the Whole Module - enc_input.shape -  torch.Size([27, 7, 2])
 - train_minimum - Lowering the Whole Module - dec_input.shape -  torch.Size([27, 12, 3])
 - train_minimum - Lowering the Whole Module - dec_source_mask.shape -  torch.Size([27, 1, 7])
 - train_minimum - Lowering the Whole Module - dec_target_mask.shape -  torch.Size([27, 12, 12])

Probably, I just don't understand the error statement.

Please let me know what I should do next, and if you need any additional information.

Thanks

TRACEBACK LOG

E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Error while creating guard:
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0] Name: ''
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Source: shape_env
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Create Function: SHAPE_ENV
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Guard Types: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Code List: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Object Weakref: None
E0525 03:16:25.806026 140644507187008 torch/_guards.py:251] [0/0]     Guarded Class Weakref: None
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0] Created at:
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 482, in transform
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     tracer = InstructionTranslator(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2060, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     output=OutputGraph(
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 310, in __init__
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     self.init_ambient_guards()
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]   File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/output_graph.py", line 421, in init_ambient_guards
E0525 03:16:25.806714 140644507187008 torch/_guards.py:253] [0/0]     self.guards.add(ShapeEnvSource().make_guard(GuardBuilder.SHAPE_ENV))
  0%|                                                                                                                                              | 0/5 [00:25<?, ?it/s]
Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 438, in <module>
    pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/__init__.py", line 151, in capture_pre_autograd_graph
    m = torch._dynamo.export(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1354, in inner
    raise constraint_violation_error
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1311, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 634, in compile_inner
    check_fn = CheckFunctionManager(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 1048, in __init__
    guard.create(builder)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_guards.py", line 249, in create
    return self.create_fn(builder, self)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/guards.py", line 705, in SHAPE_ENV
    guards = output_graph.shape_env.produce_guards(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2946, in produce_guards
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['enc_input'].size()[1] = 7 is not equal to L['dec_input'].size()[1] = 12

CODE


        dim1_x = Dim("dim1_x", min=1, max=100000)
        dynamic_shapes = {"enc_input": {1: dim1_x}, "dec_input": {1: dim1_x}, "dec_source_mask": {1: dim1_x}, "dec_target_mask": {1: dim1_x}}

        pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
                                                               (enc_input, dec_input, dec_source_mask, dec_target_mask), dynamic_shapes=dynamic_shapes)
        aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect,

        print(" - train_minimum - Lowering the Whole Module - ATen Dialect Graph")
        print(" - train_minimum - Lowering the Whole Module - aten_dialect - ", aten_dialect)

        edge_program: EdgeProgramManager = to_edge(aten_dialect)
        to_be_lowered_module = edge_program.exported_program()

        from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend
        lowered_module = edge_program.to_backend(XnnpackPartitioner())

        print(" - train_minimum - Lowering the Whole Module - lowered_module - ", lowered_module)

        save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
        with open(save_path, "wb") as f:
            f.write(lowered_module.to_executorch().buffer)
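Note that the shared dim1_x above declares that every listed axis has the same size at runtime. A minimal sketch of declaring independent sequence lengths instead (the Dim names, the 5000 upper bound, and the mask-axis mapping are assumptions inferred from the shapes printed above, not code from this thread):

```python
from torch.export import Dim

# Separate symbols for axes that can vary independently; sharing one
# Dim across inputs tells export that their sizes must always be equal.
enc_len = Dim("enc_len", min=1, max=5000)
dec_len = Dim("dec_len", min=1, max=5000)

dynamic_shapes = {
    "enc_input": {1: enc_len},                    # [N, enc_len, 2]
    "dec_input": {1: dec_len},                    # [N, dec_len, 3]
    "dec_source_mask": {2: enc_len},              # [N, 1, enc_len]
    "dec_target_mask": {1: dec_len, 2: dec_len},  # [N, dec_len, dec_len]
}
```

Distinct Dim objects remove the equality constraint between the encoder and decoder sequence lengths.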

@tarun292
Contributor

@adonnini the statement means that a guard was generated during export which checks that L['enc_input'].size()[1] == L['dec_input'].size()[1]. With the dynamic range you have provided, this constraint is violated when L['enc_input'].size()[1] is 7 and L['dec_input'].size()[1] is 12. You can enable detailed logging to see which line of model source code generated this guard, so that we can potentially change the code or change the constraint range. To do this, add the following to the top of the export script:

import os
os.environ["TORCH_LOGS"] = "+dynamo"
torch._logging._init_logs()

In the logs search for "guard added" and you should be able to see which line of model source code generated this guard.

@adonnini
Author

@tarun292 I will do as you ask and let you know what I find.
However, I am puzzled by this statement:

guard was generated during export that checks to ensure that L['enc_input'].size()[1] == L['dec_input'].size()[1]

I don't understand why this check would be done in the first place, since, unless I am mistaken, there is no requirement that enc_input.size()[1] and dec_input.size()[1] be equal. Where does that requirement come from?
What am I missing / doing wrong?

Thanks

@tarun292
Contributor

@adonnini before or after (near this log) there should be another print indicating which source line generated this guard. Are you able to see that?

@adonnini
Author

adonnini commented May 28, 2024

@tarun292

I think I may have found the source of this particular problem. Please bear with me.

Before working with ExecuTorch, I used TorchScript to run the model for inference from my Android app using PyTorch Mobile.

In order to do that I had to change this line:

x = x + Variable(self.pe[:, :x.size(1)], requires_grad=False)
to
x = x + torch.tensor(self.pe[:, :x.size(1)], requires_grad=False)

After making this change, I was able to use TorchScript to create a lowered model. Please note that model training and validation work equally well with either of the above lines enabled.

I think the guard error is caused by the line I had to change, since the error log reports that the error occurred at that line (see above):

I0528 16:11:01.096204 140565612545856 torch/fx/experimental/symbolic_shapes.py:4035] [0/0] eval Ne(5000, s1) [guard added] at Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py:353 in forward (_dynamo/utils.py:1764 in run_node)
I0528 16:11:01.097931 140565612545856 torch/fx/experimental/symbolic_shapes.py:4035] [0/0] eval s1 <= 5000 [guard added] at Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py:353 in forward (_decomp/decompositions.py:756 in slice_forward)
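These guards come from the slice self.pe[:, :x.size(1)] in the positional encoding. A minimal, self-contained sketch of why that line constrains the dynamic sequence length (assuming a [1, 5000, d_model] pe buffer, as in the linked model; d_model is illustrative, and .detach() stands in for the Variable wrapper):

```python
import torch

# Positional-encoding buffer of fixed length 5000 (assumed shape).
d_model = 2
pe = torch.zeros(1, 5000, d_model)

def add_positional(x):
    # Slicing pe with the data-dependent length x.size(1) is what makes
    # export record guards such as "s1 <= 5000": the slice is only valid
    # while the dynamic sequence length stays within the fixed buffer.
    return x + pe[:, :x.size(1)].detach()

out = add_positional(torch.randn(27, 7, d_model))
print(out.shape)  # torch.Size([27, 7, 2])
```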

To support this conclusion, if I change back to the original line:
x = x + Variable(self.pe[:, :x.size(1)], requires_grad=False)

Code execution fails, producing the error log below, which is different from the one I reported in this issue.

Sorry about this. Please let me know if you have any questions, or need me to do anything else, and what I should do next.

Thanks

ERROR LOG guard added

I0528 16:24:03.115558 139740707194688 torch/fx/experimental/symbolic_shapes.py:4035] [0/0] eval Ne(5000, s0) [guard added] at Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py:355 in forward (_dynamo/utils.py:1764 in run_node)
I0528 16:24:03.117803 139740707194688 torch/fx/experimental/symbolic_shapes.py:4035] [0/0] eval s0 <= 5000 [guard added] at Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py:355 in forward (_decomp/decompositions.py:756 in slice_forward)


TRACEBACK LOG
-------------------------
Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 446, in <module>
    pre_autograd_aten_dialect = capture_pre_autograd_graph(m,
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/__init__.py", line 151, in capture_pre_autograd_graph
    m = torch._dynamo.export(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1311, in inner
    result_traced = opt_f(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 451, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 921, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state, skip=1)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 400, in _convert_frame_assert
    return _compile(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 676, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 262, in time_wrapper
    r = func(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 535, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/bytecode_transformation.py", line 1036, in transform_code_object
    transformations(instructions, code_options)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 165, in _fn
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 500, in transform
    tracer.run()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2149, in run
    super().run()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1219, in CALL_FUNCTION
    self.call_function(fn, args, {})
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 335, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 289, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/functions.py", line 90, in call_function
    return tx.inline_user_function_return(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 680, in inline_user_function_return
    return InliningInstructionTranslator.inline_call(self, fn, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2285, in inline_call
    return cls.inline_call_(parent, func, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 2399, in inline_call_
    tracer.run()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 810, in run
    and self.step()
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 773, in step
    getattr(self, inst.opname)(inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 489, in wrapper
    return inner_fn(self, inst)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 1272, in CALL_FUNCTION_KW
    self.call_function(fn, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/symbolic_convert.py", line 674, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/user_defined.py", line 418, in call_function
    return super().call_function(tx, args, kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/variables/base.py", line 349, in call_function
    unimplemented(f"call_function {self} {args} {kwargs}")
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_dynamo/exc.py", line 190, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: call_function UserDefinedClassVariable(<class 'torch.autograd.variable.Variable'>) [TensorVariable()] {'requires_grad': ConstantVariable()}

from user code:
   File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py", line 489, in forward
    enc_embed = self.encoder_embedding.forward(enc_input)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py", line 400, in forward
    x = self.pos_encoding.forward(x)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/model.py", line 355, in forward
    x = x + Variable(self.pe[:, :x.size(1)], requires_grad=False)

@tarun292
Contributor

tarun292 commented May 28, 2024

Nice that's a good sign. Now to get past this can you try replacing:

capture_pre_autograd_graph

with

from torch.export import _trace
_trace._export(
    model,
    inputs,
    dynamic_shapes=dynamic_shapes,
    pre_dispatch=True,
    strict=False
)

@adonnini
Author

@tarun292 I did as you suggested.
Code execution fails, producing the traceback log reported below. I also include my executorch related code for your reference.
Lastly, you will also find the rest of the log produced by the execution; it's not long.
It looks like we are back to the mismatched dimension 1 error. Please let me know if you need additional information.
Thanks

TRACEBACK LOG

 Traceback (most recent call last):
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 798, in _export
    range_constraints = make_constraints(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 215, in make_constraints
    raise constraint_violation_error
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 198, in make_constraints
    shape_env.produce_guards(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2946, in produce_guards
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['args'][0][0].size()[1] = 7 is not equal to L['args'][0][1].size()[1] = 12

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 447, in <module>
    pre_autograd_aten_dialect = torch.export._trace._export(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 635, in wrapper
    raise e
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 618, in wrapper
    ep = fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/exported_program.py", line 83, in wrapper
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 805, in _export
    raise UserError(UserErrorType.CONSTRAINT_VIOLATION, str(e))  # noqa: TRY200
torch._dynamo.exc.UserError: L['args'][0][0].size()[1] = 7 is not equal to L['args'][0][1].size()[1] = 12

CODE

        from torch.export import _trace

        dim1_x = Dim("dim1_x", min=1, max=100000)
        dynamic_shapes = {"enc_input": {1: dim1_x}, "dec_input": {1: dim1_x}, "dec_source_mask": {1: dim1_x}, "dec_target_mask": {1: dim1_x}}

        pre_autograd_aten_dialect = torch.export._trace._export(
            m,
            (enc_input, dec_input, dec_source_mask, dec_target_mask),
            dynamic_shapes=dynamic_shapes,
            pre_dispatch=True,
            strict=False
        )

        aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect,
                                               (enc_input, dec_input, dec_source_mask, dec_target_mask), strict=False)

        edge_program: EdgeProgramManager = to_edge(aten_dialect)
        to_be_lowered_module = edge_program.exported_program()

        from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend

        lowered_module = edge_program.to_backend(XnnpackPartitioner())

        save_path = "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/models/tpt_delegate.pte"
        with open(save_path, "wb") as f:
            f.write(lowered_module.to_executorch().buffer)

EXECUTION LOG PRECEDING TRACEBACK LOG

 - train_minimum - Lowering the Whole Module - enc_input.shape -  torch.Size([27, 7, 2])
 - train_minimum - Lowering the Whole Module - dec_input.shape -  torch.Size([27, 12, 3])
 - train_minimum - Lowering the Whole Module - dec_source_mask.shape -  torch.Size([27, 1, 7])
 - train_minimum - Lowering the Whole Module - dec_target_mask.shape -  torch.Size([27, 12, 12])
V0529 00:23:12.396728 140157743589184 torch/fx/experimental/symbolic_shapes.py:1980] create_env
I0529 00:23:12.461530 140157743589184 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s0 = 7 for L['args'][0][0].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
V0529 00:23:12.462952 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval True == True [statically known]
V0529 00:23:12.463898 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]
I0529 00:23:12.466710 140157743589184 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s1 = 12 for L['args'][0][1].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
I0529 00:23:12.470231 140157743589184 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s2 = 12 for L['args'][0][3].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
V0529 00:23:12.648503 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s0, 1) == True [statically known]
V0529 00:23:12.648741 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval True == True [statically known]
V0529 00:23:12.650567 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]
V0529 00:23:12.652845 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(27*s0, 27) == False [statically known]
V0529 00:23:12.658359 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(Mod(27, 27*s0), 0) == True [statically known]
V0529 00:23:12.667733 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(27*s0, 27) == True [statically known]
V0529 00:23:12.670522 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s0, 1) == False [statically known]
I0529 00:23:12.676598 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval Ne(5000, s0) [guard added] (export/_safeguard.py:42 in __torch_function__)
I0529 00:23:12.678930 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval s0 <= 5000 [guard added] (_decomp/decompositions.py:756 in slice_forward)
V0529 00:23:12.745145 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval 1 < s0 == True [statically known]
V0529 00:23:12.746409 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(512*s0, 512) == True [statically known]
V0529 00:23:12.769120 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s0**2, 1) == True [statically known]
V0529 00:23:12.795464 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(64*s0, 64) == False [statically known]
V0529 00:23:12.797819 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(13824*s0, 13824) == False [statically known]
V0529 00:23:12.798557 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(64*s0, 64) == True [statically known]
V0529 00:23:13.433873 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s1, 1) == True [statically known]
V0529 00:23:13.438142 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(27*s1, 27) == False [statically known]
V0529 00:23:13.440988 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(Mod(27, 27*s1), 0) == True [statically known]
V0529 00:23:13.451060 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(27*s1, 27) == True [statically known]
V0529 00:23:13.453818 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s1, 1) == False [statically known]
I0529 00:23:13.457257 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval Ne(5000, s1) [guard added] (export/_safeguard.py:42 in __torch_function__)
I0529 00:23:13.459002 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval s1 <= 5000 [guard added] (_decomp/decompositions.py:756 in slice_forward)
V0529 00:23:13.490471 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s2, 1) == False [statically known]
V0529 00:23:13.527318 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval 1 < s1 == True [statically known]
V0529 00:23:13.529191 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(512*s1, 512) == True [statically known]
V0529 00:23:13.549526 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s1**2, 1) == True [statically known]
V0529 00:23:13.553129 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s2, 1) == True [statically known]
I0529 00:23:13.559841 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (solve_backed) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:13.560526 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(s1, 12) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 00:23:13.561443 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:13.566483 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (solve_backed) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:13.566708 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(12, s2) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 00:23:13.567889 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:13.568883 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:13.643140 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s0 = 7 (solve_backed) ValueRanges(lower=7, upper=7, is_bool=False)
I0529 00:23:13.643383 140157743589184 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(s0, 7) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 00:23:13.644494 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s0 = 7 (find) ValueRanges(lower=7, upper=7, is_bool=False)
I0529 00:23:13.671927 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:14.048779 140157743589184 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 00:23:14.857855 140157743589184 torch/fx/experimental/symbolic_shapes.py:2806] produce_guards
V0529 00:23:14.858145 140157743589184 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]

@tarun292
Contributor

@adonnini i don't see the actual place where this guard would have been generated. Can you share the whole log output? Is it also possible to share the repro for this, or is that very hard?

@adonnini
Author

@tarun292 Below you will find the entire log produced by the execution.

The repo (I think you meant repo not repro?) for the model is at
https://github.com/sharonrichushaji/trajectory-prediction-transformers/tree/master

I posted the executorch related code I use in my comment above under the heading "CODE"

Please let me know if you need additional information, or need me to do anything.

Thanks

EXECUTION LOG


(executorch_with_3.1) adonnini1@actlnxlptp6:~/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master$ python3 train-minimum.py
Value of eta max is: 0.0001
  0%|                                                                                                                                                                                  | 0/5 [00:00<?, ?it/s]Epoch 1/5....Training loss = 7.9188
 - train_minimum - Lowering the Whole Module - enc_input.shape -  torch.Size([27, 7, 2])
 - train_minimum - Lowering the Whole Module - dec_input.shape -  torch.Size([27, 12, 3])
 - train_minimum - Lowering the Whole Module - dec_source_mask.shape -  torch.Size([27, 1, 7])
 - train_minimum - Lowering the Whole Module - dec_target_mask.shape -  torch.Size([27, 12, 12])
V0529 15:46:33.920315 139994632947520 torch/fx/experimental/symbolic_shapes.py:1980] create_env
I0529 15:46:33.991929 139994632947520 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s0 = 7 for L['args'][0][0].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
V0529 15:46:33.993557 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval True == True [statically known]
V0529 15:46:33.994709 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]
I0529 15:46:33.997846 139994632947520 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s1 = 12 for L['args'][0][1].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
I0529 15:46:34.001861 139994632947520 torch/fx/experimental/symbolic_shapes.py:2724] create_symbol s2 = 12 for L['args'][0][3].size()[1] [2, 100000] (_export/non_strict_utils.py:82 in fakify)
V0529 15:46:34.178747 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s0, 1) == True [statically known]
V0529 15:46:34.178998 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval True == True [statically known]
V0529 15:46:34.181260 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]
V0529 15:46:34.183285 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(27*s0, 27) == False [statically known]
V0529 15:46:34.188770 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(Mod(27, 27*s0), 0) == True [statically known]
V0529 15:46:34.198517 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(27*s0, 27) == True [statically known]
V0529 15:46:34.201433 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s0, 1) == False [statically known]
I0529 15:46:34.207859 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval Ne(5000, s0) [guard added] (export/_safeguard.py:42 in __torch_function__)
I0529 15:46:34.210131 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval s0 <= 5000 [guard added] (_decomp/decompositions.py:756 in slice_forward)
V0529 15:46:34.274716 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval 1 < s0 == True [statically known]
V0529 15:46:34.276211 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(512*s0, 512) == True [statically known]
V0529 15:46:34.300287 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s0**2, 1) == True [statically known]
V0529 15:46:34.328634 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(64*s0, 64) == False [statically known]
V0529 15:46:34.331195 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(13824*s0, 13824) == False [statically known]
V0529 15:46:34.332053 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(64*s0, 64) == True [statically known]
V0529 15:46:35.049913 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s1, 1) == True [statically known]
V0529 15:46:35.053488 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(27*s1, 27) == False [statically known]
V0529 15:46:35.056374 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(Mod(27, 27*s1), 0) == True [statically known]
V0529 15:46:35.064713 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(27*s1, 27) == True [statically known]
V0529 15:46:35.067324 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s1, 1) == False [statically known]
I0529 15:46:35.070697 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval Ne(5000, s1) [guard added] (export/_safeguard.py:42 in __torch_function__)
I0529 15:46:35.072310 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval s1 <= 5000 [guard added] (_decomp/decompositions.py:756 in slice_forward)
V0529 15:46:35.103084 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Eq(s2, 1) == False [statically known]
V0529 15:46:35.138140 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval 1 < s1 == True [statically known]
V0529 15:46:35.139391 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(512*s1, 512) == True [statically known]
V0529 15:46:35.159959 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s1**2, 1) == True [statically known]
V0529 15:46:35.163385 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval Ne(s2, 1) == True [statically known]
I0529 15:46:35.168911 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (solve_backed) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.169559 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(s1, 12) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 15:46:35.170413 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.175276 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (solve_backed) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.175462 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(12, s2) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 15:46:35.176541 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.177481 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.248831 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s0 = 7 (solve_backed) ValueRanges(lower=7, upper=7, is_bool=False)
I0529 15:46:35.249115 139994632947520 torch/fx/experimental/symbolic_shapes.py:4035] eval Eq(s0, 7) [guard added] (_refs/__init__.py:403 in _broadcast_shapes)
I0529 15:46:35.250222 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s0 = 7 (find) ValueRanges(lower=7, upper=7, is_bool=False)
I0529 15:46:35.275912 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s2 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:35.653853 139994632947520 torch/fx/experimental/symbolic_shapes.py:3809] set_replacement s1 = 12 (find) ValueRanges(lower=12, upper=12, is_bool=False)
I0529 15:46:36.375609 139994632947520 torch/fx/experimental/symbolic_shapes.py:2806] produce_guards
V0529 15:46:36.375891 139994632947520 torch/fx/experimental/symbolic_shapes.py:4119] eval False == False [statically known]
  0%|                                                                                                                                                                                  | 0/5 [00:28<?, ?it/s]
Traceback (most recent call last):
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 798, in _export
    range_constraints = make_constraints(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 215, in make_constraints
    raise constraint_violation_error
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 198, in make_constraints
    shape_env.produce_guards(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 2946, in produce_guards
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['args'][0][0].size()[1] = 7 is not equal to L['args'][0][1].size()[1] = 12

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train-minimum.py", line 447, in <module>
    pre_autograd_aten_dialect = torch.export._trace._export(
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 635, in wrapper
    raise e
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 618, in wrapper
    ep = fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/exported_program.py", line 83, in wrapper
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch_with_3.1/lib/python3.10/site-packages/torch/export/_trace.py", line 805, in _export
    raise UserError(UserErrorType.CONSTRAINT_VIOLATION, str(e))  # noqa: TRY200
torch._dynamo.exc.UserError: L['args'][0][0].size()[1] = 7 is not equal to L['args'][0][1].size()[1] = 12
I0529 15:46:36.377562 139994632947520 torch/_dynamo/utils.py:320] TorchDynamo compilation metrics:
I0529 15:46:36.377562 139994632947520 torch/_dynamo/utils.py:320] Function, Runtimes (s)
I0529 15:46:36.377562 139994632947520 torch/_dynamo/utils.py:320] create_aot_dispatcher_function, 2.3282
(executorch_with_3.1) adonnini1@actlnxlptp6:~/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master$ 

@tarun292
Contributor

@adonnini i guess what i meant was is there a way we can repro this on our end? Would be faster to debug that way.

@adonnini
Author

adonnini commented May 30, 2024

@tarun292 You are right.

The reason I gave you the link to the repo of the model I am using was so that you could add the executorch code I use to the train.py module and run it as instructed in the repo's README.md. I can send you my executorch-related code again, tell you where I placed it, and let you know about a couple of other changes I made to train.py; or I could simply send you a copy of the train.py I am using, which I renamed train-minimum.py. Would this not work?

Another way of reproducing my set-up I can think of is for me to create a repo with the code and give you access to it.

Did you have something else (simpler) in mind?

Please let me know.
Thanks

@tarun292
Contributor

@adonnini i think sending a copy of train-minimum.py should be good enough and the instructions to run it.

@adonnini
Author

@tarun292 I set up a public repository with a copy of my set-up. Here is a link to the README.md

https://github.com/adonnini/adonnini-trajectory-prediction-transformers-masterContextQ/blob/main/README.md

Please let me know if you have problems accessing the repository or have any other problems.

As I say in the Notes section, please keep in mind that I come from the Java world and am a Python beginner.

I hope this helps.

Thanks

@adonnini
Author

adonnini commented Jun 5, 2024

@tarun292 I hope I am not bothering you too much. I know you are busy (truly I know). When do you think you will get a chance to take a look at the repository with my set-up and the error?
Please let me know if you encounter any issue when accessing the repository and running the code.
Thanks

@tarun292
Contributor

tarun292 commented Jun 6, 2024

@adonnini definitely appreciate your patience, i haven't forgotten about this issue. I'll take a look at it this weekend for sure.

@adonnini
Author

adonnini commented Jun 6, 2024

@tarun292 Thanks!

@adonnini
Author

@tarun292 please let me know if you have had any problems using the code in the repository I set up. Thanks

@adonnini
Author

@tarun292 Will you have time to take a look at this issue in the next days? As you may remember, I set up a public repository with a copy of my set-up. Here is a link to the README.md

https://github.com/adonnini/adonnini-trajectory-prediction-transformers-masterContextQ/blob/main/README.md

Please let me know if you have problems accessing the repository or have any other problems.

As I say in the Notes section, please keep in mind that I come from the Java world and am a Python beginner.

I hope this helps.

Thanks

@tarun292
Contributor

@adonnini sorry for the delay. Yes i will definitely take a look at it this week.

@adonnini
Author

adonnini commented Aug 2, 2024

@tarun292 Sorry to bother you again. Do you think you will have time to take a look at this issue in the coming week? I would really appreciate it.
By the way, I know you are very busy and have higher-priority tasks to take care of. For this issue, what matters most to me is that when you do take a look at it again you have the time to spend on it.
Thanks for your help and understanding.

@tarun292
Contributor

@adonnini apologies for the delay. Last few weeks have been hectic. I finally got time to clone your repro and give it a try and ran into the following issue.

(executorch) tkaruturi@tkaruturi-mbp adonnini-trajectory-prediction-transformers-masterContextQ % python train-minimum.py                         
/Users/tkaruturi/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/passes/_quant_patterns_and_replacements.py:106: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @impl_abstract("quantized_decomposed::embedding_byte.out")
/Users/tkaruturi/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/passes/_quant_patterns_and_replacements.py:153: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @impl_abstract("quantized_decomposed::embedding_byte.dtype_out")
/Users/tkaruturi/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/passes/_quant_patterns_and_replacements.py:228: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @impl_abstract("quantized_decomposed::embedding_4bit.out")
/Users/tkaruturi/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/passes/_quant_patterns_and_replacements.py:281: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
  @impl_abstract("quantized_decomposed::embedding_4bit.dtype_out")
              frame  ped          x         y
0     1686070624256    1  45.753380  8.312191
1     1686070624256    1  45.753380  8.312192
2     1686070624256    1  45.753380  8.312192
3     1686070624256    1  45.753387  8.312245
4     1686071410688    1  45.753407  8.312197
...             ...  ...        ...       ...
3497  1695459049472    1  45.753437  8.312192
3498  1695459049472    1  45.753433  8.312195
3499  1695459049472    1  45.753433  8.312196
3500  1695459049472    1  45.753433  8.312196
3501  1695459049472    1  45.753475  8.312253

[3502 rows x 4 columns]
Empty DataFrame
Columns: [frame, ped, x, y]
Index: []
Traceback (most recent call last):
  File "/Users/tkaruturi/Documents/Projects/adonnini-trajectory-prediction-transformers-masterContextQ/train-minimum.py", line 91, in <module>
    train_dataset, data_trg = dataloader.create_dataset(dataset_folder, dataset_name, val_size,
  File "/Users/tkaruturi/Documents/Projects/adonnini-trajectory-prediction-transformers-masterContextQ/dataloader.py", line 107, in create_dataset
    inp,out,info=get_strided_data_clust(raw_data,gt,horizon,1)
  File "/Users/tkaruturi/Documents/Projects/adonnini-trajectory-prediction-transformers-masterContextQ/dataloader.py", line 207, in get_strided_data_clust
    frames=np.stack(frame)
  File "/Users/tkaruturi/miniconda3/envs/executorch/lib/python3.10/site-packages/numpy/core/shape_base.py", line 445, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
(executorch) tkaruturi@tkaruturi-mbp adonnini-trajectory-prediction-transformers-masterContextQ % 

I added a print(data) before get_strided_data_clust in your script and that's the data you see in the log.
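The `need at least one array to stack` traceback is what `np.stack` raises whenever it receives an empty sequence, which is consistent with the empty DataFrame printed above. A minimal illustration (not the project's code; the fallback shape `(0, 4)` is just a placeholder matching the four columns):

```python
import numpy as np

# Hypothetical: no frames survived filtering, mirroring the empty DataFrame
frames = []

try:
    np.stack(frames)  # raises ValueError: need at least one array to stack
except ValueError as e:
    print(e)

# Guarding before stacking avoids the crash
arr = np.stack(frames) if frames else np.empty((0, 4))
print(arr.shape)
```

So the upstream fix was supplying data for which `get_strided_data_clust` actually produces frames, which the replacement datasets folder below addresses.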

@adonnini
Author

@tarun292 Sorry. The datasets folder needs to be replaced. I have a zip archive containing the replacement datasets folder (~10.6 MB)
datasets081224.zip
attached
It should solve the problem. Just delete the datasets folder you have in the working directory and replace it with the one in the archive, making sure it is named datasets.
Please let me know how things work out. Thanks

@adonnini
Author

When you run the code with the replacement datasets folder, execution should fail, producing the traceback reported below.
Thanks

TRACEBACK LOG FILE

Traceback (most recent call last):
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/_trace.py", line 1334, in _non_strict_export
    produce_guards_and_solve_constraints(
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 284, in produce_guards_and_solve_constraints
    raise constraint_violation_error
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/_export/non_strict_utils.py", line 266, in produce_guards_and_solve_constraints
    shape_env.produce_guards(
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/experimental/symbolic_shapes.py", line 3779, in produce_guards
    raise ConstraintViolationError(
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: L['args'][0][0].size()[1] = 7 is not equal to L['args'][0][1].size()[1] = 12

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/adonnini-trajectory-prediction-transformers-masterContextQ/train-minimum.py", line 443, in <module>
    pre_autograd_aten_dialect = torch.export._trace._export(
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/_trace.py", line 945, in wrapper
    raise e
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/_trace.py", line 928, in wrapper
    ep = fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/exported_program.py", line 89, in wrapper
    return fn(*args, **kwargs)
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/_trace.py", line 1455, in _export
    aten_export_artifact = export_func(
  File "/home/adonnini1/anaconda3/envs/executorch/lib/python3.10/site-packages/torch/export/_trace.py", line 1344, in _non_strict_export
    raise UserError(UserErrorType.CONSTRAINT_VIOLATION, str(e))  # noqa: B904

@adonnini
Author

@tarun292, I hope I am not being too much of a nuisance. Given how busy you are, when do you think you'll get a chance to try again to run the model (after replacing the datasets folder)?
If you think it makes sense, you might also want to take a look at the traceback log I sent you above.
Thanks

@tarun292
Contributor

yep i can confirm that fixes the issue and i can repro the actual export issue. Will let you know once i have more insights into what the issue is.

@adonnini
Author

Thanks! I appreciate it

@adonnini
Author

@tarun292 , Sorry to bother you again. It's been over a month since we last connected. (When) will you be able to take another look at this issue? I am still waiting for its resolution in order to be able to proceed with some of my work.
Thanks

@adonnini
Author

@tarun292 Again bugging you. Sorry. Please let me know if you will not be able to work on this issue. Thanks
