New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

escape function crashes when escaping long Chinese strings #206

Closed
PrideChung opened this Issue Nov 5, 2014 · 8 comments

Comments

Projects
None yet
6 participants
@PrideChung

PrideChung commented Nov 5, 2014

This problem would be easier to reproduce on iOS devices with smaller RAM, I first found it on an iPod Touch 5 with 512MB RAM. It crashes everytime when I try to encode a POST parameter with more than 500 Chinese characters, and it only gives me an EXC_BAD_ACCESS with no further information. On iPhone 5 with 1GB RAM it can handle about 1300 Chinese characters but still crash if the limitation is exceeded.

For quick reference, this is the escape function I copied from Alamofire.swift line 154

func escape(string: String) -> String {
    let allowedCharacters =  NSCharacterSet(charactersInString:" =\"#%/<>?@\\^`{}[]|&+").invertedSet
    return string.stringByAddingPercentEncodingWithAllowedCharacters(allowedCharacters) ?? string
}

I can almost sure the reason of the crash is stringByAddingPercentEncodingWithAllowedCharacters takes too much memory, this is particularly obvious when it's encoding Chinese or Japanese strings, maybe because the encoding process is more complex for these languages.

I discovered that if I slice the long string into small pieces and call stringByAddingPercentEncodingWithAllowedCharacters in batches like below will fix the memory peak problem,

func escape(string: String) -> String {
    let allowedCharacters =  NSCharacterSet(charactersInString:" =\"#%/<>?@\\^`{}[]|&+").invertedSet
    let batchSize = 100
    var escapedString = ""
    let stringLength = countElements(string)
    for var i = 0; i < stringLength; i += batchSize {
        let rangeLength = i + batchSize > stringLength ? stringLength - i : batchSize;
        let slicedString = (string as NSString).substringWithRange(NSMakeRange(i, rangeLength))
        escapedString += slicedString.stringByAddingPercentEncodingWithAllowedCharacters(allowedCharacters) ?? slicedString
    }
    return escapedString
}

I don't even need to place autoreleasepool between each for loops, I suspect there is already a autoreleasepool inside the implementation of stringByAddingPercentEncodingWithAllowedCharacters

Then I was wondering why I've never encountered this issue in AFNetworking, so I read the source code of AFURLRequestSerialization. Turns out stringByAddingPercentEncodingWithAllowedCharacters is only available after iOS 7 and AFNetworking supports iOS 6 so it use the lower level API CFURLCreateStringByAddingPercentEscapes which does not suffering from this memory peak issue, or at least not so obvious.

So here're two possible solutions we have for now:

  1. Do it in batches
  2. Use the lower level API like AFNetworking does.

An example project is provided to reproduce this problem: https://github.com/PrideChung/AlamofireEscapeCrashExample

@mattt

This comment has been minimized.

Show comment
Hide comment
@mattt

mattt Nov 6, 2014

Contributor

Thanks so much for the thoughtful writeup of this issue. It's really surprising that stringByAddingPercentEncodingWithAllowedCharacters would have such a glaring memory consumption issue. If you haven't already, I would strongly encourage you to file a radar about this.

Since this behavior is not produced in AFNetworking, it makes sense to just adopt the approach used there. dc1fd5c does exactly this, by porting over the same CFURLCreateStringByAddingPercentEscapes call used by AFNetworking.

Please let me know if this does not resolve your issue.

Contributor

mattt commented Nov 6, 2014

Thanks so much for the thoughtful writeup of this issue. It's really surprising that stringByAddingPercentEncodingWithAllowedCharacters would have such a glaring memory consumption issue. If you haven't already, I would strongly encourage you to file a radar about this.

Since this behavior is not produced in AFNetworking, it makes sense to just adopt the approach used there. dc1fd5c does exactly this, by porting over the same CFURLCreateStringByAddingPercentEscapes call used by AFNetworking.

Please let me know if this does not resolve your issue.

@mattt mattt closed this Nov 6, 2014

mattt added a commit that referenced this issue Nov 6, 2014

[Issue #206] Replacing stringByAddingPercentEncodingWithAllowedCharac…
…ters with CFURLCreateStringByAddingPercentEscapes to prevent out-of-memory exception
@PrideChung

This comment has been minimized.

Show comment
Hide comment
@PrideChung

PrideChung Nov 6, 2014

I can confirm this issue has been fixed, thanks for the hard work.

PrideChung commented Nov 6, 2014

I can confirm this issue has been fixed, thanks for the hard work.

@TheWalkingDead1024

This comment has been minimized.

Show comment
Hide comment
@TheWalkingDead1024

TheWalkingDead1024 Sep 9, 2015

what about Afnetworking 2.0? It could also crash when escaping long Chinese strings. It crash at stringByAddingPercentEncodingWithAllowedCharacters in this function "static NSString * AFPercentEscapedStringFromString(NSString *string) ". Those are my Chinese strings "一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十".

TheWalkingDead1024 commented Sep 9, 2015

what about Afnetworking 2.0? It could also crash when escaping long Chinese strings. It crash at stringByAddingPercentEncodingWithAllowedCharacters in this function "static NSString * AFPercentEscapedStringFromString(NSString *string) ". Those are my Chinese strings "一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十一二三四五六七八九十".

@cnoon

This comment has been minimized.

Show comment
Hide comment
@cnoon

cnoon Sep 9, 2015

Member

@TheWalkingDead1024, if you are having an issue in AFNetworking, then please open a new issue in that project.

Member

cnoon commented Sep 9, 2015

@TheWalkingDead1024, if you are having an issue in AFNetworking, then please open a new issue in that project.

@TheWalkingDead1024

This comment has been minimized.

Show comment
Hide comment
@TheWalkingDead1024

TheWalkingDead1024 commented Sep 9, 2015

OK.Thank you.

@cnoon

This comment has been minimized.

Show comment
Hide comment
@cnoon

cnoon Sep 20, 2015

Member

cc @kcharwood

Thank you for reporting this issue @TheWalkingDead1024. My apologies for first pointing you towards the AFNetworking project. This issue actually spans both projects.

The approach previously taken to use the CFURLCreateStringByAddingPercentEscapes Core Foundation call was perfectly fine until the call was deprecated in the iOS 9 SDK. Therefore, we needed to move off that call back to stringByAddingPercentEncodingWithAllowedCharacters to get away from the deprecation. Unfortunately, when that change was made, we forgot about this previously closed issue.

After much debugging, I was able to track this issue down to only occurring in Alamofire on iOS 8.1 and 8.2 using the iPhone 4S and iPhone 5 simulators. It is 100% reproducible, but is crashing in different ways depending on the size of the Chinese string that is passed in. It's always some form of a malloc error. The following sample code will reproduce the issue in the iOS Example app.

AppDelegate.swift

import Alamofire
import UIKit

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate, UISplitViewControllerDelegate {

    var window: UIWindow?

    // MARK: - UIApplicationDelegate

    func application(
        application: UIApplication,
        didFinishLaunchingWithOptions launchOptions: [NSObject : AnyObject]?)
        -> Bool
    {
        dispatch_after(
            dispatch_time(DISPATCH_TIME_NOW, Int64(2.0 * Float(NSEC_PER_SEC))),
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0))
        {
            self.runLargeChineseCharacterEncodingTest()
        }

        return true
    }

    func runLargeChineseCharacterEncodingTest() {
        let repeatedCount = 20_000
        let URL = NSURL(string: "https://example.com/movies")!
        let parameters = ["chinese": String(count: repeatedCount, repeatedString: "一二三四五六七八九十")]

        let start = NSDate()
        print("starting parameter encoding")
        ParameterEncoding.URL.encode(NSURLRequest(URL: URL), parameters: parameters)
        print("finished parameter encoding: \(NSDate().timeIntervalSinceDate(start)) seconds")
    }
}

extension String {
    private init(count: Int, repeatedString: String) {
        var value = ""
        for _ in 0..<count { value += repeatedString }
        self = value
    }
}

I've also added a unit test in 4f6b295 to verify that the encoding works as expected. The test will fail 100% of the time without the solution in place. Unfortunately it causes the test suite to crash rather than fail without the solution implemented, but that's the way it is due to the nature of the issue.

Unit Test

func testURLParameterEncodeStringWithThousandsOfChineseCharacters() {
    // Given
    let repeatedCount = 2_000
    let URL = NSURL(string: "https://example.com/movies")!
    let parameters = ["chinese": String(count: repeatedCount, repeatedString: "一二三四五六七八九十")]

    // When
    let (URLRequest, _) = encoding.encode(NSURLRequest(URL: URL), parameters: parameters)

    // Then
    var expected = "chinese="
    for _ in 0..<repeatedCount {
        expected += "%E4%B8%80%E4%BA%8C%E4%B8%89%E5%9B%9B%E4%BA%94%E5%85%AD%E4%B8%83%E5%85%AB%E4%B9%9D%E5%8D%81"
    }
    XCTAssertEqual(URLRequest.URL?.query ?? "", expected, "query is incorrect")
}

Solution

As reported originally by @PrideChung, the only workaround is to limit the encoding calls to a character limit by splitting the operation into a batch call. This is the approach I used in 4f6b295 which does in fact fix the problem. I've also added documentation to the escape implementation to remove the batching once iOS 8 is no longer supported by Alamofire.

For now, I've set the batchLimit to 50 which tends to add about a 20% overhead. This overhead is negligible in most cases until the parameter to encode gets extremely large. In the event of extremely large parameter data, the encoding should be done asynchronously.

Member

cnoon commented Sep 20, 2015

cc @kcharwood

Thank you for reporting this issue @TheWalkingDead1024. My apologies for first pointing you towards the AFNetworking project. This issue actually spans both projects.

The approach previously taken to use the CFURLCreateStringByAddingPercentEscapes Core Foundation call was perfectly fine until the call was deprecated in the iOS 9 SDK. Therefore, we needed to move off that call back to stringByAddingPercentEncodingWithAllowedCharacters to get away from the deprecation. Unfortunately, when that change was made, we forgot about this previously closed issue.

After much debugging, I was able to track this issue down to only occurring in Alamofire on iOS 8.1 and 8.2 using the iPhone 4S and iPhone 5 simulators. It is 100% reproducible, but is crashing in different ways depending on the size of the Chinese string that is passed in. It's always some form of a malloc error. The following sample code will reproduce the issue in the iOS Example app.

AppDelegate.swift

import Alamofire
import UIKit

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate, UISplitViewControllerDelegate {

    var window: UIWindow?

    // MARK: - UIApplicationDelegate

    func application(
        application: UIApplication,
        didFinishLaunchingWithOptions launchOptions: [NSObject : AnyObject]?)
        -> Bool
    {
        dispatch_after(
            dispatch_time(DISPATCH_TIME_NOW, Int64(2.0 * Float(NSEC_PER_SEC))),
            dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0))
        {
            self.runLargeChineseCharacterEncodingTest()
        }

        return true
    }

    func runLargeChineseCharacterEncodingTest() {
        let repeatedCount = 20_000
        let URL = NSURL(string: "https://example.com/movies")!
        let parameters = ["chinese": String(count: repeatedCount, repeatedString: "一二三四五六七八九十")]

        let start = NSDate()
        print("starting parameter encoding")
        ParameterEncoding.URL.encode(NSURLRequest(URL: URL), parameters: parameters)
        print("finished parameter encoding: \(NSDate().timeIntervalSinceDate(start)) seconds")
    }
}

extension String {
    private init(count: Int, repeatedString: String) {
        var value = ""
        for _ in 0..<count { value += repeatedString }
        self = value
    }
}

I've also added a unit test in 4f6b295 to verify that the encoding works as expected. The test will fail 100% of the time without the solution in place. Unfortunately it causes the test suite to crash rather than fail without the solution implemented, but that's the way it is due to the nature of the issue.

Unit Test

func testURLParameterEncodeStringWithThousandsOfChineseCharacters() {
    // Given
    let repeatedCount = 2_000
    let URL = NSURL(string: "https://example.com/movies")!
    let parameters = ["chinese": String(count: repeatedCount, repeatedString: "一二三四五六七八九十")]

    // When
    let (URLRequest, _) = encoding.encode(NSURLRequest(URL: URL), parameters: parameters)

    // Then
    var expected = "chinese="
    for _ in 0..<repeatedCount {
        expected += "%E4%B8%80%E4%BA%8C%E4%B8%89%E5%9B%9B%E4%BA%94%E5%85%AD%E4%B8%83%E5%85%AB%E4%B9%9D%E5%8D%81"
    }
    XCTAssertEqual(URLRequest.URL?.query ?? "", expected, "query is incorrect")
}

Solution

As reported originally by @PrideChung, the only workaround is to limit the encoding calls to a character limit by splitting the operation into a batch call. This is the approach I used in 4f6b295 which does in fact fix the problem. I've also added documentation to the escape implementation to remove the batching once iOS 8 is no longer supported by Alamofire.

For now, I've set the batchLimit to 50 which tends to add about a 20% overhead. This overhead is negligible in most cases until the parameter to encode gets extremely large. In the event of extremely large parameter data, the encoding should be done asynchronously.

@cnoon cnoon added this to the 2.0.2 milestone Sep 20, 2015

@cnoon cnoon self-assigned this Sep 20, 2015

scottdelly added a commit to Staance/Alamofire that referenced this issue Sep 21, 2015

Merge branch 'master' into swift-2.0
* master:
  Updated the JSON response serialization code sample in the README.
  Added release notes to the CHANGELOG and bumped the version to 2.0.2.
  [Issue Alamofire#206] Fixed percent encoding issue for long chinese strings.
  Updated the Embedded Framework documentation to include `git init` info.
  Fixed typo in the Embedded Framework section of the README.
  Added Alamofire iOS Tests as Target Dependency to Alamofire iOS framework.
  Updated the release notes in the CHANGELOG and bumped the version to 2.0.1.
  Removed an unnecessary separator in the ParameterEncoding enum.
  Removed unnecessary calls to self in SessionDelegate...no functional changes.
  Fixed issue where willCacheResponse method was potentially not being called.
  [Issue Alamofire#721] Reverted respondsToSelector override removal in SessionDelegate.

@RoCry RoCry referenced this issue Sep 23, 2015

Open

2015-09-23 #53

@tewha

This comment has been minimized.

Show comment
Hide comment
@tewha

tewha Mar 20, 2017

Does stringByAddingPercentEncodingWithAllowedCharacters still crash?

tewha commented Mar 20, 2017

Does stringByAddingPercentEncodingWithAllowedCharacters still crash?

@jshier

This comment has been minimized.

Show comment
Hide comment
@jshier

jshier Mar 20, 2017

Contributor

It was only found to crash for iOS 8.1 and 8.2, hence our workaround only affects those OSes. There shouldn't be any further issue here. If you are seeing a new crash, please open a new issue.

Contributor

jshier commented Mar 20, 2017

It was only found to crash for iOS 8.1 and 8.2, hence our workaround only affects those OSes. There shouldn't be any further issue here. If you are seeing a new crash, please open a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment