
Failure dynamically resizing for sequence length #1260

Closed
tylerweitzman opened this issue Jul 15, 2021 · 9 comments
Labels
bug Unexpected behaviour that should be corrected (type) Core ML Framework An issue related to the Core ML Framework PyTorch (traced)

Comments

@tylerweitzman

tylerweitzman commented Jul 15, 2021

🐞Describe the bug

The vocoder model (mel spectrogram to audio waveform) converts successfully with the following spec:

input {
  name: "context0"
  type {
    multiArrayType {
      shape: 1
      shape: 80
      shape: 1
      dataType: FLOAT32
      shapeRange {
        sizeRanges {
          lowerBound: 1
          upperBound: 1
        }
        sizeRanges {
          lowerBound: 80
          upperBound: 80
        }
        sizeRanges {
          lowerBound: 1
          upperBound: -1
        }
      }
    }
  }
}
output {
  name: "745"
  type {
    multiArrayType {
      dataType: FLOAT32
    }
  }
}
metadata {
  userDefined {
    key: "com.github.apple.coremltools.source"
    value: "torch==1.9.0"
  }
  userDefined {
    key: "com.github.apple.coremltools.version"
    value: "5.0b2"
  }
}

Predictions using either ct_model.predict({'context0': input.numpy()}) or ct_model.predict({'context0': input[:,:,:100].numpy()}) succeed, demonstrating that dynamic input sizes work.

Expected Result
It should work on the ANE (Apple Neural Engine).

Actual Result
On an iPhone 12 Pro Max, it fails unless the following is set:

var opts = MLPredictionOptions()
opts.usesCPUOnly = true

Unfortunately, this model is too slow on the CPU for a good user experience.

Trace

[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Blob cannot be represented on texture: width 22272 height 1 larger than valid texture dimensions width 16384 height 16384. status=-1
[coreml] Failure dynamically resizing for sequence length.
[coreml] Failure in resetSizes.
Error Domain=com.apple.CoreML Code=0 "Failure dynamically resizing for sequence length." UserInfo={NSLocalizedDescription=Failure dynamically resizing for sequence length.}

To Reproduce

The model is proprietary and cannot be provided.

System environment (please complete the following information):

conda install of Python 3.7.10, torch 1.9.0, Big Sur, Xcode 12.5, iPhone 12 Pro Max on iOS 14.6, coremltools 5.0b2

@tylerweitzman tylerweitzman added the bug Unexpected behaviour that should be corrected (type) label Jul 15, 2021
@tylerweitzman
Author

It seems that the bug is caused by this error:
[espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Blob cannot be represented on texture: width 22272 height 1 larger than valid texture dimensions width 16384 height 16384. status=-1

The input I'm using is 1 x 80 x 174 with an output of 1 x 44544

This doesn't seem to be a memory issue, since 16384 x 16384 = 268,435,456, so I'm confused about where this limitation arises. Also, how can I trace which "blob" is being represented as this 1 x 22272 texture? Could this be an intermediate value?
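A quick arithmetic sketch (an editorial addition, assuming the per-stage time-axis upsampling factors 8, 8, 2, 2 implied by the sizes listed further down in this thread) suggests 22272 is indeed an intermediate value:

```python
# Hypothetical sketch, not code from the model: trace the time-axis width of a
# 1 x 80 x 174 input through four upsampling stages with factors 8, 8, 2, 2.
factors = [8, 8, 2, 2]
length = 174  # time dimension of the 1 x 80 x 174 input
widths = []
for f in factors:
    length *= f
    widths.append(length)
print(widths)  # [1392, 11136, 22272, 44544]
```

The width 22272 matches the output of the third upsampling stage, which would make the failing blob an intermediate tensor exceeding the 16384 texture limit, not a model input or output.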

Do I need to rework the whole model to not exceed a certain width? Or is it just the inputs and outputs? Would appreciate more information about this

Please note that I have also tried converting with coremltools==4.0 (the conversion succeeds), and the same issue appears there too.

@TobyRoseman
Collaborator

I'm not understanding what exactly the issue is here. Is it that the model is too slow? Or it's not running on the ANE? When do you get the stack trace you shared?

I understand you can't share your model. Can you give us a minimal example that reproduces this issue? Can you at least share all your Python code to convert your model and a predict call which triggers the error?

@tylerweitzman
Author

tylerweitzman commented Jul 16, 2021

Hi @TobyRoseman , thanks so much for the swift response. Yes, the model cannot be run on the CPU because it is too slow on CPU.

Python Code

I am pasting my Python code below: helper functions, model components, conversion code, and the Swift code that calls the model.

Helper functions:

import numpy as np
import coremltools as ct

print('coremltools.__version__ =', ct.__version__)

def input_with(shapes):
    input_arr = []
    for i, s in enumerate(shapes):
        input_shape = ct.Shape(shape=s)
        input_tensor = ct.TensorType(name="context" + str(i), shape=input_shape, dtype=np.float32) 
        input_arr.append(input_tensor)
    return input_arr

def ct_convert(trace, ex_input, dynamic=False):
    shape = ex_input.shape
    if dynamic:
        shapeList = list(shape)
        shapeList[-1] = ct.RangeDim()
        shape = tuple(shapeList)
    return ct.convert(
        trace,
        inputs=input_with([shape]),
        minimum_deployment_target=ct.target.iOS13
    )
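One note on the helper above: calling ct.RangeDim() with no arguments leaves the upper bound unbounded, which is what produces the upperBound: -1 in the spec at the top of this issue. A bounded range can be declared instead (a sketch, not the original conversion code; the cap of 1024 is an arbitrary placeholder):

```python
import coremltools as ct

# Sketch only: cap the dynamic time axis instead of leaving it unbounded.
# ct.RangeDim accepts lower_bound / upper_bound keyword arguments.
bounded_shape = ct.Shape(shape=(1, 80, ct.RangeDim(lower_bound=1, upper_bound=1024)))
input_tensor = ct.TensorType(name="context0", shape=bounded_shape)
```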

Highest-level model component:

class ConvertibleGenerator(torch.nn.Module):
    def __init__(self, conv_pre, ups, conv_post):
        super(ConvertibleGenerator, self).__init__()
        self.conv_pre = conv_pre
        self.ups = ups
        self.conv_post = conv_post
        
    def forward(self, x): 
        x = self.conv_pre(x)
        for _, up in enumerate(self.ups):  # this loop runs 4 times
            x = F.leaky_relu(x, 0.1)
            xs = up(x)
            x = xs / 3.0
        x = F.leaky_relu(x)
        x = self.conv_post(x)
        x = torch.tanh(x)
        return x

ups is an nn.ModuleList, each element of which primarily uses ConvTranspose1d(...) and Conv1d(...) layers of various sizes
conv_pre: Conv1d(80, 512, kernel_size=(7,), stride=(1,), padding=(3,))
conv_post: Conv1d(32, 1, kernel_size=(7,), stride=(1,), padding=(3,))

More information on input and output sizes
Since this is likely a dimension issue, here is more information on the flow of an input through the graph. Suppose the input to the generator is 1 x 80 x 228, where 1 x 80 is a single mel-spectrogram frame to be vocoded, and 228 is the length based on how many phonemes are being synthesized in the requested utterance.

Here are the sizes of x as it goes through the model:
on input: torch.Size([1, 80, 228])
passed through conv_pre: torch.Size([1, 512, 228]) (since conv_pre takes 80 -> 512)
passed through self.ups[0]: torch.Size([1, 256, 1824]),
passed through self.ups[1]: torch.Size([1, 128, 14592]),
passed through self.ups[2]: torch.Size([1, 64, 29184])
passed through self.ups[3]: torch.Size([1, 32, 58368])
passed through conv_post: torch.Size([1, 1, 58368]) (since conv_post takes 32 -> 1)
This final output is the audio waveform; in this case it is 58368 samples / 22050 Hz = 2.647 seconds long
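The length bookkeeping above can be checked in a few lines (a sketch; the per-stage factors 8, 8, 2, 2 are read off the sizes just listed, since conv_pre and conv_post preserve the time dimension):

```python
# Sketch verifying the reported size flow for an input of length 228.
sample_rate = 22050
stage_factors = [8, 8, 2, 2]  # per-stage time upsampling implied by the sizes above
length = 228
lengths = []
for f in stage_factors:
    length *= f
    lengths.append(length)
print(lengths)  # [1824, 14592, 29184, 58368]
print(round(lengths[-1] / sample_rate, 3))  # 2.647 seconds of audio
```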

**The Up Sample Component: Input -> Output**
To illustrate one of the self.ups (an nn.ModuleList) passes in more detail, consider the module at index 0, self.ups[0](x):
• The up sample component gets torch.Size([1, 512, 228]) as input to forward
• Passes through ConvTranspose1d(512, 256, kernel_size=(16,), stride=(8,), padding=(4,)) taking it from 512 -> 256
• Passes through several symmetrical Conv1d(256, 256, ...)
• Returns a tensor of size torch.Size([1, 256, 1824])
I am writing this out instead of providing the code because self.ups[0] is actually a higher-level component, so this is a summary of the base component operations it eventually performs through lower-level components.
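For reference, the 228 -> 1824 jump follows directly from the output-length formula PyTorch documents for ConvTranspose1d (a standalone sketch of the formula, not code from the model):

```python
def conv_transpose1d_len(l_in, kernel, stride, padding, output_padding=0, dilation=1):
    # Output-length formula documented for torch.nn.ConvTranspose1d
    return (l_in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# ConvTranspose1d(512, 256, kernel_size=(16,), stride=(8,), padding=(4,)) on length 228:
print(conv_transpose1d_len(228, kernel=16, stride=8, padding=4))  # 1824
```

With kernel 16, stride 8, and padding 4, the formula reduces to exactly 8 * l_in, which is why each of the first two up blocks multiplies the time axis by its stride.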

Conversion code:

convertibleGenerator = ConvertibleGenerator(conv_pre_traced, upModuleList, conv_post_traced)
convertibleGenerator.eval()
cg_traced = torch.jit.trace(convertibleGenerator, torch.rand((1,80,20)))
ct_model = ct_convert(cg_traced, torch.rand((1,80,20)), dynamic=True)
ct_model.save('vocoder.mlmodel')

Swift code:

    import CoreML
    import TensorFlowLite
    ...

    let model : MLModel
    init(url: URL) throws {
        model = try MLModel(contentsOf: url)
    }

    // getAudio input comes from a TTS model run in TensorFlow Lite in another class that uses this component
    func getAudio(input: Tensor) throws -> Data {
        let nsData = input.data as NSData
        let rawPtr = nsData.bytes
        let pointer = UnsafeMutablePointer<Float32>(OpaquePointer(rawPtr))
        print("input shape", input.shape)
        let melFrame = try MLMultiArray(dataPointer: pointer, shape: [1, 80, 228] as [NSNumber], dataType: .float, strides: [18240, 228, 1] as [NSNumber], deallocator: { rawPtr in
        }) // 228 number is dynamic for other inputs besides current test input, varying with length of the sentence
        let features = VocoderFeatureProvider(melFrame: melFrame)
        var opts = MLPredictionOptions()
        opts.usesCPUOnly = false // When set to true the model runs without error. Fails when set to false
        var modelOutput = try model.prediction(from: features, options: opts)
        var audioTensor = modelOutput.featureValue(for: "745")!.multiArrayValue!
        let data = Data.init(bytes: audioTensor.dataPointer, count: 58368) // count is dynamic, hard-coded only for testing
        return data
    }

@TobyRoseman TobyRoseman added PyTorch (traced) Core ML Framework An issue related to the Core ML Framework labels Oct 22, 2021
@Coice

Coice commented Jan 21, 2022

@tylerweitzman Did you ever find a solution for this?

@tylerweitzman
Author

tylerweitzman commented Jan 21, 2022 via email

@TobyRoseman
Collaborator

Is this still an issue on macOS Monterey?

@tylerweitzman
Author

tylerweitzman commented Jan 22, 2022 via email

@tylerweitzman
Author

tylerweitzman commented Jan 22, 2022 via email

@TobyRoseman
Collaborator

Thanks for the further information. It now seems clear that this is not an issue with the coremltools Python package, but with the Core ML Framework.

Please see if this issue is fixed with the latest version of iOS.

If it is still an issue, you can report it here: https://developer.apple.com/bug-reporting/
You could also discuss this issue in the developer forum: https://developer.apple.com/forums/
