[Bug] monaco.editor.tokenize does not return correct results #3448

Open
1 of 2 tasks
princefishthrower opened this issue Dec 3, 2022 · 6 comments
Labels
feature-request Request for new features or functionality tokenization

@princefishthrower

Reproducible in vscode.dev or in VS Code Desktop?

  • Not reproducible in vscode.dev or VS Code Desktop

Reproducible in the monaco editor playground?

Monaco Editor Playground Code

monaco.editor.create(document.getElementById('container'), {
    value: "function hello() {\n\talert('Hello world!');\n}",
    language: 'typescript'
});

const tokens = monaco.editor.tokenize(`// My awesome TypeScript function!
export const areEqual = (a: number, b: number): boolean => {
    return a === b;
}`, "typescript");

console.log("tokens", tokens);

const tokensEscaped = monaco.editor.tokenize("// My awesome TypeScript function!\nexport const areEqual = (a: number, b: number): boolean => {\n\treturn a === b;\n}", "typescript");

console.log("tokensEscaped", tokensEscaped);

Reproduction Steps

  1. Try to pass a preformatted string to monaco.editor.tokenize
  2. Log out the results via console.log

Actual (Problematic) Behavior

Tokens are not detected (though lines are)

Expected Behavior

Tokens to be detected!

Additional Context

The strange thing is that if you paste the following into the browser console, it works:

tokens = monaco.editor.tokenize(`// My awesome TypeScript function!
    export const areEqual = (a: number, b: number): boolean => {
        return a === b;
    }`, "typescript");
console.log("tokens", tokens);
tokens (4) [Array(1), Array(27), Array(9), Array(1)]

I feel like this is some sort of initialization issue, but calling monaco.editor.tokenize multiple times within my code doesn't solve the problem: I still get empty token lines. Very strange.

As you can see, I thought it was an issue with escaping \n and \t, but even the escaped single-line string doesn't work.
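
For reference, a template literal containing real line breaks and a double-quoted string with `\n` escapes produce the same characters, so escaping on its own would not be expected to change the tokenizer's input. A small check, using made-up variable names:

```typescript
// A template literal with a real line break.
const fromTemplate = `// comment
const x = 1;`;

// The same text written with an escape sequence.
const fromEscapes = "// comment\nconst x = 1;";

// Both forms contain a single line-feed character at the same position.
const sameText: boolean = fromTemplate === fromEscapes;
console.log(sameText); // true
```

The two strings in the reproduction differ only in indentation (four spaces versus a tab), which fits the observation that neither form behaves differently from the other.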

Perhaps this is something small, but I can't see what is going wrong here. Need another set of eyes.

@princefishthrower
Author

princefishthrower commented Dec 5, 2022

I looked into this further, and this is almost definitely an initialization or asynchrony-related bug. When I do this:

setTimeout(() => {
  const tokens = monaco.editor.tokenize(`// My awesome TypeScript function!
  export const areEqual = (a: number, b: number): boolean => {
      return a === b;
  }`, "typescript");

  console.log("tokens", tokens);
}, 5000);

It works!

Perhaps tokenize should be refactored as an async function that returns Promise<Token[][]>? Or there at least needs to be a way to check when the editor is ready. I still have no idea what is going on internally to cause this behavior.
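
One way to picture the "check when the editor is ready" idea is a small polling helper that retries a synchronous call until a caller-supplied predicate accepts the result. This is only a sketch under assumptions: `delay`, `pollUntil`, their parameters, and the readiness predicate are hypothetical names and not part of Monaco's API.

```typescript
// Resolve after the given number of milliseconds.
const delay = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call `attempt` repeatedly until `isReady` accepts its result or the
// attempts run out. Hypothetical helper, not a Monaco API.
async function pollUntil<T>(
  attempt: () => T,
  isReady: (result: T) => boolean,
  maxAttempts = 20,
  intervalMs = 250
): Promise<T> {
  for (let i = 0; i < maxAttempts; i++) {
    const result = attempt();
    if (isReady(result)) {
      return result;
    }
    // Only wait if another attempt is still to come.
    if (i < maxAttempts - 1) {
      await delay(intervalMs);
    }
  }
  throw new Error(`Result not ready after ${maxAttempts} attempts`);
}
```

With Monaco, `attempt` could be `() => monaco.editor.tokenize(source, "typescript")`, and `isReady` would need to tell real tokens apart from the empty token lines described above, for example by checking that some line has more than one token. That criterion is a guess based on the output shown in this issue, not documented behavior.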

@hediet added the bug, tokenization, and feature-request labels and removed the bug label on Dec 12, 2022
@hediet
Member

hediet commented Dec 12, 2022

You can do this before calling tokenize:

await languages.TokenizationRegistry.getOrCreate(languageId);

@hediet closed this as completed on Dec 12, 2022
@princefishthrower
Author

Closed

I can't find this anywhere in the Monaco documentation, nor is TokenizationRegistry a property on the TypeScript types for languages:

import { languages } from "monaco-editor/esm/vs/editor/editor.api";
// ...
await languages.TokenizationRegistry.getOrCreate("typescript");

Property 'TokenizationRegistry' does not exist on type 'typeof languages'.ts(2339)

Unless I'm mistaken and this languages is coming from some other library?

Therefore I would like a bit more clarification on this issue before it is closed.

@hediet
Member

hediet commented Dec 14, 2022

Ah, it is only exposed by vs/editor/common/languages.
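
A minimal sketch of the suggestion above, assuming `TokenizationRegistry.getOrCreate` resolves once the language's tokenizer has loaded. The `RegistryLike` and `TokenizerLike` interfaces and the `tokenizeWhenReady` helper are hypothetical names introduced here so that the snippet does not depend on a particular import path; in a real project the registry would come from `vs/editor/common/languages` and the tokenizer would be `monaco.editor`.

```typescript
// Hypothetical structural types: only the members this sketch needs.
interface RegistryLike {
  getOrCreate(languageId: string): Promise<unknown>;
}

interface TokenizerLike<T> {
  tokenize(text: string, languageId: string): T[][];
}

// Wait for the language's tokenization support, then tokenize synchronously.
async function tokenizeWhenReady<T>(
  registry: RegistryLike,
  editor: TokenizerLike<T>,
  text: string,
  languageId: string
): Promise<T[][]> {
  await registry.getOrCreate(languageId);
  return editor.tokenize(text, languageId);
}
```

Called as `await tokenizeWhenReady(TokenizationRegistry, monaco.editor, source, "typescript")`, this would take the place of a fixed `setTimeout` delay. Whether the internal module can be imported in a given bundler setup is not settled in this thread.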

@hediet reopened this on Dec 14, 2022
@princefishthrower
Author

@hediet - is this a separate GitHub package that can be installed? Is it compatible with the browser?

@princefishthrower
Author

princefishthrower commented Dec 19, 2022

For anyone who may find this later, I currently have a fairly reliable workaround. Essentially, I access the global window.monaco object within a setTimeout and call the tokenize function. While this first call usually fails as described above, all subsequent tokenize calls work as expected.

In code, the workaround looks like this:

setTimeout(() => {
  (window as any).monaco.editor.tokenize(
    `export const dummyFunction = () => {
  console.log('hello world')
}`,
    "typescript"
  );
  // any call to (window as any).monaco.editor.tokenize beyond this point works
}, 1000);

You can alternatively promisify this setTimeout and do something like:

const initializeMonaco = new Promise<void>((resolve) =>
  setTimeout(() => {
    // Throwaway call: this first tokenize usually returns no tokens.
    (window as any).monaco.editor.tokenize(
      `export const dummy = () => {
    console.log('hello world')
  }`,
      "typescript"
    );
    resolve();
  }, 1000)
);
await initializeMonaco;
// now (window as any).monaco.editor.tokenize() will work

Obviously, I'd like to not use this hack, but it gets the job done for now.
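
The warm-up trick above can be wrapped so that it runs at most once per language, no matter how many callers need the tokenizer. A minimal sketch, assuming that a single throwaway `tokenize` call after a short wait is enough, as reported in this comment; `createWarmUp` and its parameters are hypothetical names, and the default delay is the same arbitrary value used above.

```typescript
// Build a function that performs a one-time warm-up per language and caches
// its promise. `tokenize` is injected so the sketch does not assume a global
// `monaco` object.
function createWarmUp(
  tokenize: (text: string, languageId: string) => unknown,
  delayMs = 1000
): (languageId: string) => Promise<void> {
  const pending = new Map<string, Promise<void>>();

  return (languageId: string): Promise<void> => {
    let ready = pending.get(languageId);
    if (ready === undefined) {
      ready = new Promise<void>((resolve, reject) => {
        setTimeout(() => {
          try {
            // Throwaway call: its result is ignored.
            tokenize("const dummy = 1;", languageId);
            resolve();
          } catch (error) {
            reject(error);
          }
        }, delayMs);
      });
      pending.set(languageId, ready);
    }
    return ready;
  };
}
```

A caller would create the function once, for example `const warmUp = createWarmUp((text, id) => (window as any).monaco.editor.tokenize(text, id))`, and then `await warmUp("typescript")` before each real tokenize call. Repeated calls reuse the cached promise instead of scheduling another timeout.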
