Skip to content
This repository has been archived by the owner on Oct 10, 2018. It is now read-only.

Optimize and cleanup regex usage #337

Merged
merged 4 commits into from Nov 6, 2017

Conversation

tdd
Copy link
Contributor

@tdd tdd commented Nov 4, 2017

Description

As announced in #335, here's my take on cleaning up and optimizing regex usage in src/. As soon as the PR passes (I'm still victim to local errors that don't seem to happen in CI, so checking…) I'll add comments throughout the PR changes view to clarify the "why" of the various refactorings.

This does not fix anything, mind you. It just provides more idiomatic, and probably slightly more performant, regex usage throughout the code.

Cheers!

@tdd
Copy link
Contributor Author

tdd commented Nov 4, 2017

Dangit, lint fail 😊

Copy link
Contributor Author

@tdd tdd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with the detailed rationales for every regex tweak.

@@ -60,7 +60,7 @@ export function getAbsolutLibraryName(library: string, actualFilePath: string, r
return '/' + toPosix(relative(
rootPath,
normalize(join(parse(actualFilePath).dir, library)),
)).replace(/[/]$/g, '');
)).replace(/\/$/, '');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Escaping is idiomatically done with backslashes (you usually do, btw)
  • You seem to be putting the g flag everywhere, but 99% of the time you don't need it, and it entails more overhead / different return semantics / sticky behavior.

@@ -160,7 +160,7 @@ export async function findFiles(config: ExtensionConfig, workspaceFolder: Worksp
o.filter(
f => f.fsPath
.replace(rootPath || '', '')
.split(/\\|\//)
.split(/[\\/]/)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatives are costly, classes are preferable for single chars (you've done that most elsewhere, btw)

@@ -1,5 +1,7 @@
import { Position, TextEditor } from 'vscode';

const REGEX_IGNORED_LINE = /^\s*(?:\/\/|\/\*\*|\*\/|\*|(['"])use strict\1)/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Ensuring quote match the usual way (group backreference)
  • Switching to non-capturing group for better perf (as we won't be needing the group capture)
  • Stripping useless costly g flag

@@ -13,6 +15,6 @@ export function getImportInsertPosition(editor: TextEditor | undefined): Positio
return new Position(0, 0);
}
const lines = editor.document.getText().split('\n');
const index = lines.findIndex(line => !line.match(/^\s*(\/\/|\/\*\*|\*\/|\*|['"]use strict['"])/g));
const index = lines.findIndex(line => !REGEX_IGNORED_LINE.test(line));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • regex.test(x) perfs better than x.match(regex) when all we need is a yes/no

@@ -9,6 +9,10 @@ import { Logger } from '../utilities/winstonLogger';
import { ImplementPolymorphElements, NoopCodeAction } from './CodeAction';
import { CodeActionCreator } from './CodeActionCreator';

const REGEX_GENERICS = /^(.*)<(.*)>$/;
const REGEX_INCORRECT_IMPL = /class (['"])(.*)\1 incorrectly implements.*(['"])(.*)\3\./i;
const REGEX_NON_ABSTRACT_IMPL = /non-abstract class (['"])(.*)\1.*implement inherited.*from class (['"])(.*)\3\./i;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • < and > are not special-chars by themselves: removing the class-delimiter escapes
  • Ensuring quote match the usual way (group backreference)
  • Stripping useless costly g flags (also, when re-using the same Regexp instances because of extraction, they'd result in a partially-sticky matching behavior we don't want)

@@ -57,7 +59,7 @@ export class MissingImportCreator extends CodeActionCreator {
return commands;
}

const infos = index.declarationInfos.filter(o => o.declaration.name === match[1]);
const infos = index.declarationInfos.filter(o => o.declaration.name === match[2]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Because we used an extra group for quote pairing, 1 becomes 2.

@@ -21,6 +21,8 @@ import { getItemKind } from '../utilities/utilityFunctions';
import { Logger } from '../utilities/winstonLogger';
import { BaseExtension } from './BaseExtension';

const REGEX_COMMENT = /^\s*(?:\/\/|\/\*\*|\*\/|\*)/;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Switching to non-capturing group for better perf (as we won't be needing the group capture)
  • Stripping useless costly g flag

lineText.startsWith('import ') ||
lineText.substring(0, position.character).match(new RegExp(`(\w*[.])+${searchWord}`, 'g'))) {
new RegExp(`(?:\w*\.)+${searchWord}`).test(lineText.substring(0, position.character))) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Escaping is idiomatically done with backslashes (you usually do, btw)
  • Switching to non-capturing group for better perf (as we won't be needing the group capture)
  • Stripping useless costly g flag
  • regex.test(x) perfs better than x.match(regex) when all we need is a yes/no

@@ -11,10 +11,12 @@ import { RemainImportGroup } from './RemainImportGroup';
*/
export type ImportGroupSetting = string | { identifier: string, order: ImportGroupOrder };

const REGEX_REGEX_GROUP = /^\/.+\/$/;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • .+.* looks like a RegexDOS invite. There's no need at all to add .* at the end here.
  • Stripping useless costly g flag

@@ -53,7 +53,8 @@ describe('OrganizeImportsOnSaveExtension', () => {
});
});

it('should remove an unused import on save', async () => {
it('should remove an unused import on save', async function () {
this.timeout(4000);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests consistently take more than 2s on my 2013 MBA to run, resulting in inaccurate test failures. I took the liberty of allowing them up to 4s to run, which solved the issue for me. These then fail locally (grmbl) but work in CI, so…

@buehler
Copy link
Owner

buehler commented Nov 5, 2017

@tdd Just wow. Thank you! Really learnt some stuff with this pr :-)

@buehler buehler merged commit 06b739c into buehler:develop Nov 6, 2017
@tdd tdd deleted the regex-grooming branch November 6, 2017 15:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
2 participants