
Failure dynamically resizing for sequence length #1260

Closed
tylerweitzman opened this issue Jul 15, 2021 · 9 comments
Labels
bug Unexpected behaviour that should be corrected (type) Core ML Framework An issue related to the Core ML Framework PyTorch (traced)

Comments

@tylerweitzman

tylerweitzman commented Jul 15, 2021

🐞Describe the bug

The vocoder model (mel spectrogram to audio waveform) converts successfully with the following spec:

input {
  name: "context0"
  type {
    multiArrayType {
      shape: 1
      shape: 80
      shape: 1
      dataType: FLOAT32
      shapeRange {
        sizeRanges {
          lowerBound: 1
          upperBound: 1
        }
        sizeRanges {
          lowerBound: 80
          upperBound: 80
        }
        sizeRanges {
          lowerBound: 1
          upperBound: -1
        }
      }
    }
  }
}
output {
  name: "745"
  type {
    multiArrayType {
      dataType: FLOAT32
    }
  }
}
metadata {
  userDefined {
    key: "com.github.apple.coremltools.source"
    value: "torch==1.9.0"
  }
  userDefined {
    key: "com.github.apple.coremltools.version"
    value: "5.0b2"
  }
}

Predictions using either ct_model.predict({'context0': input.numpy()}) or ct_model.predict({'context0': input[:,:,:100].numpy()}) succeed, demonstrating that dynamic input sizes work.

Expected Result
It should work on the ANE (Apple Neural Engine).

Actual Result
On an iPhone 12 Pro Max, it fails unless the following is set:

var opts = MLPredictionOptions()
opts.usesCPUOnly = true

Unfortunately, this model is too slow on the CPU for a good user experience.

Trace

[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Blob cannot be represented on texture: width 22272 height 1 larger than valid texture dimensions width 16384 height 16384. status=-1
[coreml] Failure dynamically resizing for sequence length.
[coreml] Failure in resetSizes.
Error Domain=com.apple.CoreML Code=0 "Failure dynamically resizing for sequence length." UserInfo={NSLocalizedDescription=Failure dynamically resizing for sequence length.}

To Reproduce

The model is proprietary and cannot be provided.

System environment (please complete the following information):

conda install of Python 3.7.10, torch 1.9.0, Big Sur, Xcode 12.5, iPhone 12 Pro Max on iOS 14.6, coremltools 5.0b2

@tylerweitzman tylerweitzman added the bug Unexpected behaviour that should be corrected (type) label Jul 15, 2021
@tylerweitzman
Author

It seems that the bug is caused by this error:
[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Blob cannot be represented on texture: width 22272 height 1 larger than valid texture dimensions width 16384 height 16384. status=-1

The input I'm using is 1 x 80 x 174 with an output of 1 x 44544

This doesn't seem to be a memory issue, since 16384 x 16384 = 268,435,456, so I'm confused about where this limitation arises. Also, how can I trace which "blob" is being represented as this 1 x 22272 texture? Could this be an intermediate value?
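A quick arithmetic sketch (an editorial addition, assuming the per-stage time-axis upsampling factors 8, 8, 2, 2 implied by the sizes listed further down in this thread) suggests 22272 is indeed an intermediate value:

```python
# Hypothetical sketch, not code from the model: trace the time-axis width of a
# 1 x 80 x 174 input through four upsampling stages with factors 8, 8, 2, 2.
factors = [8, 8, 2, 2]
length = 174  # time dimension of the 1 x 80 x 174 input
widths = []
for f in factors:
    length *= f
    widths.append(length)
print(widths)  # [1392, 11136, 22272, 44544]
```

The width 22272 matches the output of the third upsampling stage, which would make the failing blob an intermediate tensor exceeding the 16384 texture limit, not a model input or output.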

Do I need to rework the whole model to not exceed a certain width? Or is it just the inputs and outputs? Would appreciate more information about this

Please note that I have also tried converting with coremltools==4.0 (the conversion succeeds), and the same issue appears there too.

@TobyRoseman
Collaborator

I'm not understanding what exactly the issue is here. Is it that the model is too slow? Or it's not running on the ANE? When do you get the stack trace you shared?

I understand you can't share your model. Can you give us a minimal example that reproduces this issue? Can you at least share all your Python code to convert your model and a predict call which triggers the error?

@tylerweitzman
Author

tylerweitzman commented Jul 16, 2021

Hi @TobyRoseman , thanks so much for the swift response. Yes, the model cannot be run on the CPU because it is too slow on CPU.

Python Code

I am pasting my Python code below: helper functions, model components, conversion code, and the Swift code that calls the model.

Helper functions:

import numpy as np
import coremltools as ct

print('coremltools.__version__ =', ct.__version__)

def input_with(shapes):
    input_arr = []
    for i, s in enumerate(shapes):
        input_shape = ct.Shape(shape=s)
        input_tensor = ct.TensorType(name="context" + str(i), shape=input_shape, dtype=np.float32) 
        input_arr.append(input_tensor)
    return input_arr

def ct_convert(trace, ex_input, dynamic=False):
    shape = ex_input.shape
    if dynamic:
        shapeList = list(shape)
        shapeList[-1] = ct.RangeDim()
        shape = tuple(shapeList)
    return ct.convert(
        trace,
        inputs=input_with([shape]),
        minimum_deployment_target=ct.target.iOS13
    )
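One note on the helper above: calling ct.RangeDim() with no arguments leaves the upper bound unbounded, which is what produces the upperBound: -1 in the spec at the top of this issue. A bounded range can be declared instead (a sketch, not the original conversion code; the cap of 1024 is an arbitrary placeholder):

```python
import coremltools as ct

# Sketch only: cap the dynamic time axis instead of leaving it unbounded.
# ct.RangeDim accepts lower_bound / upper_bound keyword arguments.
bounded_shape = ct.Shape(shape=(1, 80, ct.RangeDim(lower_bound=1, upper_bound=1024)))
input_tensor = ct.TensorType(name="context0", shape=bounded_shape)
```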

Highest-level model component:

class ConvertibleGenerator(torch.nn.Module):
    def __init__(self, conv_pre, ups, conv_post):
        super(ConvertibleGenerator, self).__init__()
        self.conv_pre = conv_pre
        self.ups = ups
        self.conv_post = conv_post
        
    def forward(self, x): 
        x = self.conv_pre(x)
        for _, up in enumerate(self.ups):  # this loop runs 4 times
            x = F.leaky_relu(x, 0.1)
            xs = up(x)
            x = xs / 3.0
        x = F.leaky_relu(x)
        x = self.conv_post(x)
        x = torch.tanh(x)
        return x

ups is an nn.ModuleList, each element of which primarily uses ConvTranspose1d(...) and Conv1d(...) layers of various sizes
conv_pre: Conv1d(80, 512, kernel_size=(7,), stride=(1,), padding=(3,))
conv_post: Conv1d(32, 1, kernel_size=(7,), stride=(1,), padding=(3,))

More information on input and output sizes
Since this is likely a dimension issue, here is more information on the flow of an input through the graph. Suppose the input to the generator is 1 x 80 x 228, where 1 x 80 is a single mel-spectrogram frame to be vocoded, and 228 is the length based on how many phonemes are being synthesized in the requested utterance.

Here are the sizes of x as it goes through the model:
on input: torch.Size([1, 80, 228])
passed through conv_pre: torch.Size([1, 512, 228]) (since conv_pre takes 80 -> 512)
passed through self.ups[0]: torch.Size([1, 256, 1824]),
passed through self.ups[1]: torch.Size([1, 128, 14592]),
passed through self.ups[2]: torch.Size([1, 64, 29184])
passed through self.ups[3]: torch.Size([1, 32, 58368])
passed through conv_post: torch.Size([1, 1, 58368]) (since conv_post takes 32 -> 1)
This final output is the audio waveform; in this case it is 58368 samples / 22050 Hz = 2.647 seconds long
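The length bookkeeping above can be checked in a few lines (a sketch; the per-stage factors 8, 8, 2, 2 are read off the sizes just listed, since conv_pre and conv_post preserve the time dimension):

```python
# Sketch verifying the reported size flow for an input of length 228.
sample_rate = 22050
stage_factors = [8, 8, 2, 2]  # per-stage time upsampling implied by the sizes above
length = 228
lengths = []
for f in stage_factors:
    length *= f
    lengths.append(length)
print(lengths)  # [1824, 14592, 29184, 58368]
print(round(lengths[-1] / sample_rate, 3))  # 2.647 seconds of audio
```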

**The Up Sample Component: Input -> Output**
To illustrate one of the self.ups (an nn.ModuleList) passes in more detail, consider the module at index 0, self.ups[0](x):
• The up sample component gets torch.Size([1, 512, 228]) as input to forward
• Passes through ConvTranspose1d(512, 256, kernel_size=(16,), stride=(8,), padding=(4,)) taking it from 512 -> 256
• Passes through several symmetrical Conv1d(256, 256, ...)
• Returns a tensor of size torch.Size([1, 256, 1824])
I am writing this out instead of providing the code because self.ups[0] is actually a higher-level component, so this is a summary of the base component operations it eventually performs through lower-level components.
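For reference, the 228 -> 1824 jump follows directly from the output-length formula PyTorch documents for ConvTranspose1d (a standalone sketch of the formula, not code from the model):

```python
def conv_transpose1d_len(l_in, kernel, stride, padding, output_padding=0, dilation=1):
    # Output-length formula documented for torch.nn.ConvTranspose1d
    return (l_in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# ConvTranspose1d(512, 256, kernel_size=(16,), stride=(8,), padding=(4,)) on length 228:
print(conv_transpose1d_len(228, kernel=16, stride=8, padding=4))  # 1824
```

With kernel 16, stride 8, and padding 4, the formula reduces to exactly 8 * l_in, which is why each of the first two up blocks multiplies the time axis by its stride.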

Conversion code:

convertibleGenerator = ConvertibleGenerator(conv_pre_traced, upModuleList, conv_post_traced)
convertibleGenerator.eval()
cg_traced = torch.jit.trace(convertibleGenerator, torch.rand((1,80,20)))
ct_model = ct_convert(cg_traced, torch.rand((1,80,20)), dynamic=True)
ct_model.save('vocoder.mlmodel')

Swift code:

    import CoreML
    import TensorFlowLite
    ...

    let model : MLModel
    init(url: URL) throws {
        model = try MLModel(contentsOf: url)
    }

    // getAudio input comes from a TTS model run in TensorFlow Lite in another class that uses this component
    func getAudio(input: Tensor) throws -> Data {
        let nsData = input.data as NSData
        let rawPtr = nsData.bytes
        let pointer = UnsafeMutablePointer<Float32>(OpaquePointer(rawPtr))
        print("input shape", input.shape)
        let melFrame = try MLMultiArray(dataPointer: pointer, shape: [1, 80, 228] as [NSNumber], dataType: .float, strides: [18240, 228, 1] as [NSNumber], deallocator: { rawPtr in
        }) // 228 number is dynamic for other inputs besides current test input, varying with length of the sentence
        let features = VocoderFeatureProvider(melFrame: melFrame)
        var opts = MLPredictionOptions()
        opts.usesCPUOnly = false // When set to true the model runs without error. Fails when set to false
        var modelOutput = try model.prediction(from: features, options: opts)
        var audioTensor = modelOutput.featureValue(for: "745")!.multiArrayValue!
        let data = Data.init(bytes: audioTensor.dataPointer, count: 58368) // count is dynamic, hard-coded only for testing
        return data
    }

@TobyRoseman TobyRoseman added PyTorch (traced) Core ML Framework An issue related to the Core ML Framework labels Oct 22, 2021
@Coice

Coice commented Jan 21, 2022

@tylerweitzman Did you ever find a solution for this?

@tylerweitzman
Author

tylerweitzman commented Jan 21, 2022 via email

@TobyRoseman
Collaborator

Is this still an issue on macOS Monterey?

@tylerweitzman
Author

tylerweitzman commented Jan 22, 2022 via email

@tylerweitzman
Author

tylerweitzman commented Jan 22, 2022 via email

@TobyRoseman
Collaborator

Thanks for the further information. It now seems clear that this is not an issue with the coremltools Python package, but with the Core ML Framework.

Please see if this issue is fixed with the latest version of iOS.

If it is still an issue, you can report it here: https://developer.apple.com/bug-reporting/
You could also discuss this issue in the developer forum: https://developer.apple.com/forums/
