inconsistent no-reply with PDF translation + workaround #14

Open
boeryepes opened this issue Mar 24, 2022 · 7 comments

@boeryepes

Hi, to translate documents I call Await Translator.TranslateDocumentAsync(...) from your API in my code. This works flawlessly for DOCX and PPTX, but for PDF files it is inconsistent: regularly the translation task never completes. The problem is not reproducible, and I found no relation to file size, content, name, etc.

I then checked the implementation of the task in the DeepL .NET API VS solution and it is this:

        handle = await TranslateDocumentUploadAsync(...).ConfigureAwait(false);
        await TranslateDocumentWaitUntilDoneAsync(handle.Value, cancellationToken).ConfigureAwait(false);
        await TranslateDocumentDownloadAsync(handle.Value, outputFile, cancellationToken).ConfigureAwait(false);

So I replaced Await Translator.TranslateDocumentAsync() with the three calls above, as follows:

Dim docHandle As Model.DocumentHandle = Await _translator.TranslateDocumentUploadAsync(...)
Await _translator.TranslateDocumentWaitUntilDoneAsync(docHandle)
Await _translator.TranslateDocumentDownloadAsync(docHandle, outputFileInfo)

Calling these three methods separately always works, for PDFs and for the other file types, so I wonder why the single API call TranslateDocumentAsync does not always work for PDFs. The only difference is that I do not use .ConfigureAwait(false), because according to the documentation and other developers it is not needed.

Can you advise what is going on?

@dieter-enns-deepl
Contributor

Hi,

First I would like to explain why we use ConfigureAwait(false). It is the recommended way to avoid potential deadlocks in libraries. Functionally, ConfigureAwait(false) only means that the continuation (the code after the await) does not have to be marshaled back to the synchronization context that was current when the async function was awaited; it may run on another thread instead. So on paper, using ConfigureAwait(false) should be safer than leaving it out. If you would like to learn more about this topic, I recommend the following article, which explains it in more detail and better than I ever could: https://devblogs.microsoft.com/dotnet/configureawait-faq/#why-would-i-want-to-use-configureawaitfalse
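The effect on continuations can be made visible with a small console sketch. This is not taken from the DeepL library: `CountingContext` and `ContextDemo` are hypothetical names, and the context simply counts how many continuations are posted to it before handing them to the thread pool.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical context for illustration: counts continuations posted to it
// and runs them on the thread pool.
sealed class CountingContext : SynchronizationContext {
    public int Posts;

    public override void Post(SendOrPostCallback d, object? state) {
        Interlocked.Increment(ref Posts);
        ThreadPool.QueueUserWorkItem(_ => d(state));
    }
}

static class ContextDemo {
    // The continuation after this await is posted back to the captured context.
    static async Task AwaitWithCapture() {
        await Task.Delay(50);
    }

    // The continuation after this await bypasses the captured context.
    static async Task AwaitWithoutCapture() {
        await Task.Delay(50).ConfigureAwait(false);
    }

    // Returns the number of posts seen after each of the two calls.
    public static (int AfterCapture, int AfterNoCapture) Run() {
        var previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try {
            AwaitWithCapture().Wait();
            int afterCapture = context.Posts;
            AwaitWithoutCapture().Wait();
            int afterNoCapture = context.Posts;
            return (afterCapture, afterNoCapture);
        } finally {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    static void Main() {
        var (afterCapture, afterNoCapture) = Run();
        Console.WriteLine($"posts after plain await: {afterCapture}");
        Console.WriteLine($"posts after ConfigureAwait(false): {afterNoCapture}");
    }
}
```

The counter should go up for the plain await and stay unchanged for the ConfigureAwait(false) call. The blocking `.Wait()` is safe here only because this context runs continuations on the thread pool; a UI-style context that runs them on the blocked thread itself is where the classic deadlock comes from.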

Another thing raises even more questions for me. The three asynchronous functions that you started calling yourself contain awaited calls with ConfigureAwait(false) as well:

  • public async Task<DocumentHandle> TranslateDocumentUploadAsync(
          Stream inputFile,
          string inputFileName,
          string? sourceLanguageCode,
          string targetLanguageCode,
          DocumentTranslateOptions? options = null,
          CancellationToken cancellationToken = default) {
      var bodyParams = CreateHttpParams(
            sourceLanguageCode,
            targetLanguageCode,
            options);
      using var responseMessage = await _client.ApiUploadAsync(
                  "/v2/document/",
                  cancellationToken,
                  bodyParams,
                  inputFile,
                  inputFileName)
            .ConfigureAwait(false);
      await DeepLClient.CheckStatusCodeAsync(responseMessage).ConfigureAwait(false);
      return await JsonUtils.DeserializeAsync<DocumentHandle>(responseMessage).ConfigureAwait(false);
    }
  • public async Task TranslateDocumentWaitUntilDoneAsync(
          DocumentHandle handle,
          CancellationToken cancellationToken = default) {
      var status = await TranslateDocumentStatusAsync(handle, cancellationToken).ConfigureAwait(false);
      while (status.Ok && !status.Done) {
        await Task.Delay(CalculateDocumentWaitTime(status.SecondsRemaining), cancellationToken).ConfigureAwait(false);
        status = await TranslateDocumentStatusAsync(handle, cancellationToken).ConfigureAwait(false);
      }
      if (!status.Ok) {
        throw new DeepLException("Document translation resulted in an error");
      }
    }
  • public async Task TranslateDocumentDownloadAsync(
          DocumentHandle handle,
          Stream outputFile,
          CancellationToken cancellationToken = default) {
      var bodyParams = new (string Key, string Value)[] { ("document_key", handle.DocumentKey) };
      using var responseMessage = await _client.ApiPostAsync(
                  $"/v2/document/{handle.DocumentId}/result",
                  cancellationToken,
                  bodyParams)
            .ConfigureAwait(false);
      await DeepLClient.CheckStatusCodeAsync(responseMessage, downloadingDocument: true).ConfigureAwait(false);
      await responseMessage.Content.CopyToAsync(outputFile).ConfigureAwait(false);
    }

Coincidentally, each of these functions even contains three calls with ConfigureAwait(false) (just as a side note; the number shouldn't matter much). So if ConfigureAwait(false) is what causes the faulty behavior in the single TranslateDocumentAsync call, shouldn't it cause trouble in TranslateDocumentUploadAsync, TranslateDocumentWaitUntilDoneAsync and TranslateDocumentDownloadAsync as well? That puzzles me.

I tried to reproduce the issue and came up with the following (I am not a native VB.NET programmer, so please bear with me):

Imports System.IO
Imports DeepL

Module Module1
  Sub Main()
    Translate().Wait() ' Blocking call because I don't know how to do an async Main in VB.Net
  End Sub

  Private Async Function Translate() As Task
    Dim translator = new Translator("[Private]")
    Await translator.TranslateDocumentAsync(
      New FileInfo("C:\[Private].pdf"),
      New FileInfo("C:\[Private]_translated.pdf"),
      "de",
      "en-US").ConfigureAwait(false) ' I also tried without ".ConfigureAwait(false)" with same result
  End Function
End Module

I couldn't reproduce the issue, but the translation took quite a while to complete. I would need to ask a colleague whether that matters, but I suspect that processing a PDF takes more time than our other supported file formats. Of course, I also don't know how long you waited.
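Since the open question is how long to wait, one way to tell "slow" apart from "never completes" is to put an explicit upper bound on the call. The following is only a sketch: `RunWithTimeoutAsync` and `TimeoutDemo` are hypothetical names, not part of the DeepL library, and a `Task.Delay` stands in for the translation so that the example runs without an API key.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TimeoutDemo {
    // Hypothetical helper (not part of the DeepL library): runs an operation
    // with a token that is cancelled after the timeout. Returns true if the
    // operation finished, false if it was cancelled by the timeout.
    public static async Task<bool> RunWithTimeoutAsync(
            Func<CancellationToken, Task> operation,
            TimeSpan timeout) {
        using var cts = new CancellationTokenSource(timeout);
        try {
            await operation(cts.Token).ConfigureAwait(false);
            return true;
        } catch (OperationCanceledException) when (cts.IsCancellationRequested) {
            return false;
        }
    }

    static async Task Main() {
        // Stand-ins for a quick and a very slow translation.
        bool fast = await RunWithTimeoutAsync(
            ct => Task.Delay(50, ct), TimeSpan.FromSeconds(5));
        bool slow = await RunWithTimeoutAsync(
            ct => Task.Delay(5000, ct), TimeSpan.FromMilliseconds(100));
        Console.WriteLine($"fast finished: {fast}, slow finished: {slow}");
    }
}
```

With the real library, the lambda would forward `ct` as the `cancellationToken` argument of the document call; the snippets quoted earlier show that the upload, wait and download methods accept one, but the exact overload to use is an assumption here.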

In conclusion, I unfortunately can't tell you why it doesn't work. I suspect it is only a coincidence that .ConfigureAwait(false) looks like the influencing factor. However, I don't yet have a clue what else might cause the faulty behavior in your case. Sorry.

If it would help you, we could make the library configurable through TranslatorOptions so that you can choose whether false or true is passed to .ConfigureAwait(bool), with false as the default for backward compatibility. Functionally, .ConfigureAwait(true) is neutral: it behaves as if .ConfigureAwait(bool) had not been called at all.

@boeryepes
Author

boeryepes commented Apr 1, 2022 via email

@daniel-jones-deepl
Member

Hi Klaas,

Thank you for testing the issue on your side. Regarding the quota, we can remove those document translations from your quota usage. To do so we need to identify your account; could you please contact our support team?

@boeryepes
Author

boeryepes commented Apr 5, 2022 via email

@daniel-jones-deepl
Member

Hi Klaas,
Your email address is censored in your messages; I guess that GitHub censors your email address when you reply via email.
Regards,
Daniel

@boeryepes
Author

boeryepes commented Apr 7, 2022 via email

@daniel-jones-deepl
Member

daniel-jones-deepl commented Apr 14, 2022

Hi Klaas,
GitHub censors your email address; all I see is ***@***.******@***.***. You can see it online on GitHub here: #14 (comment)

Please send an email to open-source@deepl.com, so I can use your email address to identify your DeepL account.
Regards,
Daniel
