LESS Refactoring - add LanguageManager #2844

Merged
merged 57 commits into from Feb 27, 2013

3 participants
Contributor

DennisKehrig commented Feb 11, 2013

Submitting this as a pull request so it can get tracked the usual way :)

@ghost ghost assigned peterflynn Feb 12, 2013

Contributor

DennisKehrig commented Feb 13, 2013

Rebased once more. We should merge this early in a sprint to catch possible mistakes I made when merging.

src/document/DocumentManager.js
+ if (file.fullPath === newName) {
+ _this.language = null;
+ }
+ });
@peterflynn

peterflynn Feb 20, 2013

Member

This introduces a memory leak -- $(module.exports) will keep references to every Document ever created. Why not do this down in the existing notifyPathNameChanged() in the existing code that loops over _openDocuments?

@peterflynn

peterflynn Feb 20, 2013

Member

Also, we should document that all this is doing is throwing away a cache -- language will lazily get set back to something valid as soon as it's asked for.

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Good catch!

@peterflynn

peterflynn Feb 27, 2013

Member

The memory leak issue (now a TODO in the code) is spun off as #2961
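A minimal sketch of the alternative suggested in this thread: instead of each Document registering a listener on the module (which keeps every Document alive), the rename handler walks the open documents and drops their cached language. `FakeDocument`, the shape of `_openDocuments`, and the prefix-based path matching are simplified assumptions for illustration, not the actual DocumentManager code.

```javascript
// Hypothetical, simplified stand-ins for DocumentManager internals.
var _openDocuments = {}; // maps full path -> document

function FakeDocument(fullPath) {
    this.file = { fullPath: fullPath };
    this.language = null; // cached value, resolved lazily elsewhere
}

// Called once per rename. No per-document listeners are needed, so
// nothing holds on to documents that have already been closed.
function notifyPathNameChanged(oldName, newName) {
    Object.keys(_openDocuments).forEach(function (path) {
        // Crude prefix match (assumption): also covers renamed folders.
        if (path.indexOf(oldName) === 0) {
            var doc = _openDocuments[path];
            var newPath = newName + path.slice(oldName.length);

            delete _openDocuments[path];
            doc.file.fullPath = newPath;
            doc.language = null; // only throws away the cache
            _openDocuments[newPath] = doc;
        }
    });
}
```

Because the loop only touches documents that are still registered as open, closing a document is enough to make it collectable.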

+
+ /**
+ * The Language for this document. Will be resolved by file extension in the constructor
+ * @type {!Language}
@peterflynn

peterflynn Feb 20, 2013

Member

Should really be {?Language} given that this remains null until the first call to getLanguage(). Perhaps it'd be cleaner to not initialize this lazily though -- have something like a _updateLanguage() function that's called in the ctor and upon rename, and then have a dumb getter that merely returns the field.

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Something like _updateLanguage() feels better, I agree. Having things break early is easier to debug.
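The eager variant discussed here could look roughly like the sketch below. `SketchDocument`, `rename()`, and the extension table are hypothetical names used only for illustration; the point is that `_updateLanguage()` runs in the constructor and on rename, so the getter never has to resolve anything.

```javascript
// Hypothetical lookup table standing in for the language registry.
var languagesByExtension = { js: "javascript", less: "less" };

function getLanguageForPath(fullPath) {
    var dot = fullPath.lastIndexOf(".");
    var ext = dot === -1 ? "" : fullPath.slice(dot + 1).toLowerCase();
    var known = Object.prototype.hasOwnProperty.call(languagesByExtension, ext);
    return known ? languagesByExtension[ext] : "unknown";
}

function SketchDocument(fullPath) {
    this.file = { fullPath: fullPath };
    this._updateLanguage(); // resolved eagerly, so the field is never null
}

SketchDocument.prototype._updateLanguage = function () {
    this.language = getLanguageForPath(this.file.fullPath);
};

// "Dumb" getter: merely returns the field.
SketchDocument.prototype.getLanguage = function () {
    return this.language;
};

// Hypothetical rename hook; the real trigger would be the rename notification.
SketchDocument.prototype.rename = function (newPath) {
    this.file.fullPath = newPath;
    this._updateLanguage();
};
```

With this shape the field could be documented as `{!Language}` again, since it is set before the constructor returns.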

src/document/DocumentManager.js
+ if (doc && doc.file.fullPath === newName) {
+ closeFullEditor(doc.file);
+ setCurrentDocument(doc);
+ }
@peterflynn

peterflynn Feb 20, 2013

Member

This will blow away the selection, scroll position, any inline editors, etc. If you add a check so that we only do this if the language has changed, then we could probably get away with it for now... but even then, please file a spinoff bug to track the issue for later. Eventually we should update the Editor's mode more directly and expect others like JSLint to listen for renames or language changes or whatnot themselves.

@peterflynn

peterflynn Feb 20, 2013

Member

Also... doesn't this still need to be done for Editors that aren't visible too? And for secondary inline editors? Otherwise, when will their mode get corrected? (Seems like a bug in the old EditorManager-based code too. In the current UI you can't rename files other than the current one, but that's a dangerous assumption, and the code still breaks if any inlines are open for the same file.)

Maybe it's best to do some partial cleanup now -- e.g. have Document dispatch a "languageChanged" event and have every Editor listen to its Document and update its own mode accordingly. (And JSLint et al could be treated as a lower-priority bug, since I think stuff like that wasn't handled by the old code either.)
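A rough sketch of that partial cleanup, under the assumption of a tiny hand-rolled listener list (Brackets itself dispatches events through jQuery, which is left out here to keep the example self-contained). `SketchDocument`, `SketchEditor`, and `onLanguageChanged` are hypothetical names.

```javascript
function SketchDocument(language) {
    this.language = language;
    this._listeners = [];
}

SketchDocument.prototype.onLanguageChanged = function (listener) {
    this._listeners.push(listener);
};

SketchDocument.prototype.setLanguage = function (language) {
    var oldLanguage = this.language;
    if (language === oldLanguage) {
        return; // nothing changed, so nobody is notified
    }
    this.language = language;
    this._listeners.forEach(function (listener) {
        listener(oldLanguage, language);
    });
};

// Every editor, visible or not, full or inline, listens to its own document.
function SketchEditor(doc) {
    var self = this;
    this.mode = doc.language.mode;
    doc.onLanguageChanged(function (oldLanguage, newLanguage) {
        self.mode = newLanguage.mode; // selection and scroll stay untouched
    });
}
```

Since each editor updates itself, the full editor would not have to be closed and reopened on rename.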

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Re: "expect others like JSLint to listen for renames or language changes"
This isn't the first time that I wish I could easily express that "the code below is based on the current value of doc.language, run it again when that changes" - i.e. write observing(doc, "language", function (value, previousValue) { ... }), calling the callback once right away and again later when something changes. In many cases the code would be the same for initialization and updates, meaning users wouldn't have to think much about this at all.
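To make the idea concrete, here is one possible shape for such a helper. Nothing like `observing` exists in Brackets according to this thread; `makeObservable`, `set`, and `_listeners` are likewise invented for the sketch.

```javascript
// Hypothetical helper: gives a plain object a change-notifying setter.
function makeObservable(target) {
    var listeners = {};
    target._listeners = listeners;
    target.set = function (key, value) {
        var previous = target[key];
        if (previous === value) {
            return;
        }
        target[key] = value;
        (listeners[key] || []).forEach(function (callback) {
            callback(value, previous);
        });
    };
    return target;
}

// Runs the callback once right away, then again on every later change.
function observing(target, key, callback) {
    var listeners = target._listeners;
    listeners[key] = listeners[key] || [];
    listeners[key].push(callback);
    callback(target[key], undefined);
}
```

A caller would write `observing(doc, "language", function (value, previousValue) { ... })` once and get both the initialization and the update path from the same function.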

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

I suppose I'm reinventing Cocoa's bindings here.

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

I filed #2911 and #2913 accordingly.

src/editor/Editor.js
@@ -1191,7 +1191,7 @@ define(function (require, exports, module) {
* an *approximation* of whether the mode is consistent across the whole range (a pattern like
* A-B-A would return A as the mode, not null).
*
- * @return {?(Object|String)} Object or Name of syntax-highlighting mode; see {@link EditorUtils#getModeFromFileExtension()}.
+ * @return {?(Object|String)} Object or Name of syntax-highlighting mode; {@link Languages#getLanguageFromFileExtension()} and {@link Language.mode}.
@peterflynn

peterflynn Feb 20, 2013

Member

Nit: use "#" instead of "." in the second link (and same with the other docs change below).

@@ -730,7 +721,6 @@ define(function (require, exports, module) {
CommandManager.register(Strings.CMD_LINE_UP, Commands.EDIT_LINE_UP, moveLineUp);
CommandManager.register(Strings.CMD_LINE_DOWN, Commands.EDIT_LINE_DOWN, moveLineDown);
CommandManager.register(Strings.CMD_SELECT_LINE, Commands.EDIT_SELECT_LINE, selectLine);
-
@peterflynn

peterflynn Feb 20, 2013

Member

I think this space was intentional, separating commands that are just proxies for CM behavior vs. commands whose implementations actually live here... not a big deal either way though.

@DennisKehrig

DennisKehrig Feb 21, 2013

Contributor

It's back in... wondering why this isn't marked outdated.

src/editor/EditorCommandHandlers.js
@@ -75,11 +75,10 @@ define(function (require, exports, module) {
* and cursor position. Applies to currently focused Editor.
*
* If all non-whitespace lines are already commented out, then we uncomment; otherwise we comment
- * out. Commenting out adds "//" to at column 0 of every line. Uncommenting removes the first "//"
+ * out. Commenting out adds prefix to at column 0 of every line. Uncommenting removes the first prefix
@peterflynn

peterflynn Feb 20, 2013

Member

While you're in here, would you mind removing "to" to fix the existing grammar bug?

src/editor/EditorCommandHandlers.js
@@ -209,7 +208,7 @@ define(function (require, exports, module) {
* @param {!String} suffix
* @param {boolean=} slashComment - true if the mode also supports "//" comments
@peterflynn

peterflynn Feb 20, 2013

Member

Docs for the changed arg need updating

src/editor/EditorManager.js
@@ -104,7 +103,7 @@ define(function (require, exports, module) {
/**
* Creates a new Editor bound to the given Document. The editor's mode is inferred based on the
@peterflynn

peterflynn Feb 20, 2013

Member

"Inferred" might be the wrong word now... maybe just "set"? (Ditto for createInlineEditorForDocument() below)

@@ -0,0 +1,41 @@
+/*
+ * Copyright (c) 2012 Adobe Systems Incorporated. All rights reserved.
@peterflynn

peterflynn Feb 20, 2013

Member

Should this be 2013, or was it already pushed up in December?

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Very thorough! It was, however, created on December 11th, 2012.

src/utils/ExtensionLoader.js
+ context([entryPoint], onLoad);
+ }
+ }
+ });
@peterflynn

peterflynn Feb 20, 2013

Member

I don't think I understand what this block of code does... could you explain it to me & also add documentation? :-)

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Sure :) It allows extensions to do this:
require("extension!LESSSupport")

src/utils/ExtensionLoader.js
};
+ define("extension", {
+ load: function requireExtension(name, req, onLoad, config) {
+ var context = contexts[name], entryPoint = entryPoints[name];
@peterflynn

peterflynn Feb 20, 2013

Member

Nit: I think we usually keep initializers on separate lines from each other

src/utils/ExtensionLoader.js
@@ -40,6 +40,7 @@ define(function (require, exports, module) {
var _init = false,
contexts = {},
+ entryPoints = {},
@peterflynn

peterflynn Feb 20, 2013

Member

Could you document the type of this map (e.g. {Object<string, ???>}) and the contexts one?

src/brackets.js
@@ -28,7 +28,8 @@
require.config({
paths: {
"text" : "thirdparty/text",
- "i18n" : "thirdparty/i18n"
+ "i18n" : "thirdparty/i18n",
+ "mode" : "thirdparty/CodeMirror2/mode"
@peterflynn

peterflynn Feb 20, 2013

Member

This could use some explanatory docs too. The other two items here are related to Require plugins -- things you use via a "!" prefix expression. That's not true of "mode", so it's potentially confusing. Am I correct that this is just creating a "path variable" of sorts that extensions can use to get at CM modules that we ship with but don't load by default? And that this isn't merely a convenient shorthand, but the only way that such extensions could point to these modules reliably?

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

This is used by the LESS extension - require("mode/less/less")
mode is an alias to the CodeMirror mode directory, so extensions are independent of where CodeMirror is located; we could arbitrarily adjust this later by turning mode into a plugin. By allowing extensions to require a mode this way (rather than calling some loadBuiltinMode function as I did earlier), a language can be fully defined by the time the language extension is done loading, rather than making this asynchronous.

There are other ways to load modes, though. For instance require("text/../CodeMirror2/mode/...") could work :)

src/utils/ExtensionUtils.js
@@ -23,7 +23,7 @@
/*jslint vars: true, plusplus: true, devel: true, nomen: true, indent: 4, maxerr: 50 */
-/*global define, $, brackets, less */
+/*global define, $, brackets, CodeMirror, less */
@peterflynn

peterflynn Feb 20, 2013

Member

This change seems unnecessary

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Thank you, that was left over from the loadBuiltinMode function. Would be nice if JSLint complained about unnecessary definitions here...

src/language/Languages.js
+ /**
+ * Defines a language.
+ *
+ * @param {!string} id Unique identifier for this language, use only letter a-z (i.e. "cpp")
@peterflynn

peterflynn Feb 20, 2013

Member

Why not allow dots, for package-style naming? That will help ensure uniqueness...

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

I like that!

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

I chose the underscore instead of the dot, though. Causes less trouble when used in file names.
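A sketch of what the resulting ID check might look like, assuming "letters a-z with underscores in between" means no leading, trailing, or doubled underscores. The function name and the exact pattern are assumptions; the real validation may differ.

```javascript
// Assumed rule: lowercase words joined by single underscores.
var LANGUAGE_ID_PATTERN = /^[a-z]+(_[a-z]+)*$/;

function isValidLanguageId(id) {
    return typeof id === "string" && LANGUAGE_ID_PATTERN.test(id);
}
```

Under this rule `"cpp"` and `"foo_bar"` pass, while `"foo.bar"` and `"C++"` are rejected.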

src/language/Languages.js
+ // Public methods
+ module.exports = {
+ defineLanguage: defineLanguage,
+ getLanguage: getLanguage,
@peterflynn

peterflynn Feb 20, 2013

Member

Seems like we don't need to expose this? I don't see calls to it anywhere...

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

It's pretty much the basis for actually extending a language in some fashion. For instance when an extension wants to make XML support more complete by setting comment styles, it needs to access the already defined language somehow.

src/language/Languages.js
+ }
+
+ var mode = definition.mode, mimeMode = definition.mimeMode, modeAliases = definition.modeAliases;
+ if (mode) {
@peterflynn

peterflynn Feb 20, 2013

Member

Perhaps mode shouldn't be optional -- it's basically always a mistake if it's missing. We could have "unknown" provide mode: "" just like EditorUtils used to.

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Not sure what the benefit would be. Doesn't undefined nail it exactly? Do we somehow want to make sure we can always call language.mode.length or something like that?

@peterflynn

peterflynn Feb 26, 2013

Member

The benefit would be anyone who screws up their API usage and forgets to specify a mode (or has a typo in the property name, etc.) would see an explicit error instead of just having the experience that their code doesn't work. It feels weird that most of the other settings are so strictly validated, but you can leave off the mode field and we'll just silently go with it.
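The strict variant argued for here could be as small as the following sketch. `validateDefinition` and the error wording are hypothetical; the empty string for plain text follows the suggestion earlier in this thread.

```javascript
// Fails loudly when mode is missing or misspelled in the definition.
function validateDefinition(definition) {
    if (typeof definition.mode !== "string") {
        throw new Error(
            "Language \"" + definition.name + "\" must specify a mode " +
                "(use \"\" for plain text)"
        );
    }
    return definition;
}
```

A definition with a typo such as `{ name: "C++", mod: "clike" }` then produces an explicit error instead of a language that silently has no highlighting.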

src/language/Languages.js
+ * @param {Array.<string>} definition.blockComment Array with two entries defining the block comment prefix and suffix (i.e. ["<!--", "-->"])
+ * @param {string} definition.lineComment Line comment prefix (i.e. "//")
+ * @param {string} definition.mode Low level CodeMirror mode for this language (i.e. "clike")
+ * @param {string} definition.mimeMode High level CodeMirror mode or MIME mode for this language (i.e. "text/x-c++src")
@peterflynn

peterflynn Feb 20, 2013

Member

We should specify that this is optional, and that while it takes precedence over 'mode' you still need to specify the mode as well... and it must match the mode that is registered to that mimetype. Which actually seems a little error-prone -- I wonder if there's a way we could get the mode from CM automatically so that people could specify the mimetype alone?

@peterflynn

peterflynn Feb 20, 2013

Member

We should explain that the mode needs to either be explicitly require()'ed by the caller, or be a mode that ships with CodeMirror (& thus Brackets) by default (or optionally both).

@peterflynn

peterflynn Feb 20, 2013

Member

The comment on mimeMode sort of implies it could be something other than a mimetype -- "High level CodeMirror mode OR...". I'm guessing that's referring to the { name: ... } construct that CM accepts? But I think we don't actually support anything other than a string mimetype.

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

We can probably get the mode for a MIME mode once it's loaded, but it still has to be explicitly defined so we know what file to load in the first place. I now changed this to one single mime setting that can optionally be an array (["clike", "text/x-java"]).
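One way to read the combined setting described here is sketched below: a plain string names the mode, and a two-element array names the mode followed by its MIME mode. `parseMode` and the returned object shape are assumptions for illustration.

```javascript
function isString(value) {
    return typeof value === "string";
}

// Accepts "mode" or ["mode", "mimeMode"]; anything else is an error.
function parseMode(mode) {
    if (isString(mode)) {
        return { mode: mode, mimeMode: null };
    }
    if (Array.isArray(mode) && mode.length === 2 && mode.every(isString)) {
        return { mode: mode[0], mimeMode: mode[1] };
    }
    throw new Error("mode must be a string or an array of two strings");
}
```

The first array entry still tells the loader which mode file to fetch, which is why the MIME mode cannot stand on its own.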

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

True, the documentation should be greatly enhanced here.
The "or" was badly chosen. I had a hard time describing what I meant and went for two different descriptions of the same thing ("high level mode", "MIME mode").

src/language/Languages.js
+ language._setMode(mimeMode || mode, modeAliases);
+ };
+
+ if (_hasMode(mode)) {
@peterflynn

peterflynn Feb 20, 2013

Member

This function checks for both mode names and mimetypes, but we only ever pass in the mode name. It seems like checking only mode name is enough, so maybe _hasMode() could be simplified?

Also, could we rename it to make it clearer -- it could be read as telling whether we (Languages.js) "have" the language already registered, not whether CM has it loaded & registered...

src/language/Languages.js
+ * @param {!string} description A helpful identifier for value
+ */
+ function _validateString(value, description) {
+ if (toString.call(value) !== '[object String]') {
@peterflynn

peterflynn Feb 20, 2013

Member

Nit: should use double quotes.

Also, I've never seen a bare 'toString' referenced before... is this the same as 'Object.prototype.toString'?
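For reference, the explicit spelling of that check is sketched below. A bare `toString` is typically resolved through the global object and usually ends up at the same function, but writing `Object.prototype.toString` avoids relying on that lookup. The helper name `isString` is an assumption.

```javascript
// Tag-based check: also accepts String wrapper objects,
// which a plain typeof test would report as "object".
function isString(value) {
    return Object.prototype.toString.call(value) === "[object String]";
}
```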

src/language/Languages.js
+ Language.prototype._setMode = function (mode, modeAliases) {
+ var i;
+
+ _validateMode(mode, "mode");
@peterflynn

peterflynn Feb 20, 2013

Member

I think if "mode" were actually not a string, we could have already errored out back in defineLanguage()... maybe the validation should be done there instead? It also seems redundant to double-check that the mode is loaded when this is a private method called only by code that has just done exactly that.

src/language/Languages.js
+ * @param {!string} id Unique identifier for this language, use only letter a-z (i.e. "cpp")
+ * @param {!Object} definition An object describing the language
+ * @param {!string} definition.name Human-readable name of the language, as it's commonly referred to (i.e. "C++")
+ * @param {Array.<string>} definition.fileExtensions List of file extensions used by this language (i.e. ["php", "php3"])
@peterflynn

peterflynn Feb 20, 2013

Member

We should either document that these must be all-lowercase, or explicitly convert them ourselves.
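The second option, converting the extensions ourselves, might look like this sketch. The map name mirrors `_fileExtensionsMap` from the diff, but the two functions and the "unknown" fallback ID are simplified assumptions.

```javascript
var _fileExtensionsMap = {};

// Normalizes on the way in, so callers may pass "PHP" or "php".
function addFileExtension(extension, languageId) {
    _fileExtensionsMap[extension.toLowerCase()] = languageId;
}

// Normalizes on the way out as well, so "INDEX.PHP" still matches.
function getLanguageIdForFileExtension(extension) {
    var key = extension.toLowerCase();
    var known = Object.prototype.hasOwnProperty.call(_fileExtensionsMap, key);
    return known ? _fileExtensionsMap[key] : "unknown";
}
```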

@DennisKehrig

DennisKehrig Feb 20, 2013

Contributor

Very good catch.

src/language/Languages.js
+ * @return {Language} The language for the provided mode or the fallback language
+ */
+ function getLanguageForMode(mode) {
+ var i, modes = _modeMap[mode];
@peterflynn

peterflynn Feb 20, 2013

Member

'modes' seems like the wrong var name here... this is an array of Languages, isn't it?

src/language/Languages.js
+ var _fallbackLanguage = null,
+ _languages = {},
+ _fileExtensionsMap = {},
+ _modeMap = {};
@peterflynn

peterflynn Feb 20, 2013

Member

There seems (to me) to be a lot of complexity around the hierarchy of precedence when mapping modes to languages. We have this global 'mode -> (list of Languages)' map, and a similar map within each Language, plus a Language.modeAliases array. And there are several ways of asking for what Language goes with a given mode.

Not sure I have my head wrapped around it all, but with that caveat I have a few suggestions for simplifying things:

It seems like some complexity stems from trying to make the "alias" option as general & flexible as possible, and I think that might be overkill. The one place we use an alias today is really not a case of two strings that mean the same mode...it's actually two different modes that we want to present in the UI as the same thing: an inner mode ("html") vs. an outer mixed mode ("htmlmixed"). It might actually be bug-prone/confusing to conflate the two concepts too much, and it might also clean up the code somewhat if we can do something specific to inner languages rather than a generic "aliases" mechanism...

  • The per-Language map is not used as far as I can tell -- Language._setLanguageForMode() isn't called anywhere. How about we remove this part of it until we have a strong need for another layer of precedence?
  • Rename 'modeAliases' to something like 'modeWhenInner' -- and perhaps make it a singleton instead of an array. (It's hard to picture a single user-concept "language" having more than one outer or more than one inner mode. Certainly various different outer languages could use the same other language as an inner mode -- e.g. "htmlmixed" and "php" both use "html" -- but that's not what this "alias" functionality is needed for).
  • Simplify the global getLanguageForMode() to just return the first language in the array. It's not clear to me that there's any benefit to preferring outer-mode languages, or even what it would mean to have a second language that maps to the same inner mode but not the same outer mode. (And the global _setLanguageForMode() will warn on mode-usage collisions for both outer and inner mappings, which seems appropriate to me since both seem equally like a problem).
  • Keep Language.getLanguageForMode() similar to today (preferring self before calling the global function -- although this seems to me like graceful error handling rather than functionality with a real use case). We'd lose the this._modeMap check due to the first bullet above though, and we could potentially fold usesMode() into this function since no one else would be calling it at this point.
  • Or if we keep usesMode() separate, let's rename it for clarity -- htmlmixed "uses" lots of inner modes, but this function only returns true for one of them (html). Really, it's asking if a mode maps to this exact language -- so perhaps isRepresentedByMode() or something like that...
@DennisKehrig

DennisKehrig Feb 21, 2013

Contributor

When I discovered this weird "html" mode in addition to "htmlmixed", I wanted to allow a language to have multiple modes. But then mode validation didn't work anymore because "html" isn't really a mode. It just seemed to be a weird internal alias, and I thought there might be more of them.

But as it turns out, we invent this mode in TokenUtils ourselves. Where we normally would just report "xml" as the mode name, we make a distinction and instead return "html". A bunch of places rely on this made-up mode. I think we can get around this now, though.

For instance, when registering for code hints, the registration should be for a language, not a mode. Likewise, the CSSInlineEditor makes sure it only opens for HTML files - another decision that should be made based on the language, not the mode.

HTMLUtils also checks that the mode is "html", but why? Because it deals with tags, something that works the same way in XML.

So we could remove the XML distinction in TokenUtils, let it just return "xml", which is an actual mode, and map "xml" to HTML for our HTML language object - exactly how this was intended. That is, if it's safe to assume that XML inside HTML always actually is HTML. It is possible to embed XML in XHTML, but I doubt they make this distinction, and I certainly would still want this to be considered HTML.

As a result, the reported language would correctly be HTML despite the mode being "xml", and if the CSSInlineEditor and the hint providers were to check for that instead of the mode, they are good. And HTMLUtils should be fine checking for the xml mode instead - getTagAttributes() and getTagInfo() don't sound HTML-specific at all.

Then we could get rid of mode aliases entirely, which is a huge relief indeed.

In the meantime, so we don't have to refactor CSSInlineEditor, etc. to use the language API, we could just change the returned mode to "htmlmixed" instead of "html" or hard-code this distinction into Languages.js for now. The latter sounds better, as it will avoid issues with extensions that register something for "html". Later we could actually treat this as a language ID rather than a mode name, so extensions wouldn't even have to change where mode names match language IDs.

Sound good?

@peterflynn

peterflynn Feb 27, 2013

Member

I've spun this off as #2965. The current solution seems good enough for now.

Member

peterflynn commented Feb 20, 2013

I'm unable to run unit tests from your branch. There are a couple of problems:

  • Editor-test has one broken suite that depends on EditorUtils, and it prevents the whole test-runner from loading since it can't find that module. If you fix or comment that out, then you can see the other issues...
  • Lots of tests fail -- e.g. about 1/2 of EditorCommandHandlers-test, most of CSSUtils-test, one more test in Editor-test, etc.
  • The console gets a big spew of errors about being unable to load various CM modes. I think your change in brackets.js may need to go into SpecRunner.js also. @jasonsanjose probably understands how Require is affected by unit tests the best, though. I'm guessing fixing this will fix a bunch of the test failures too...
Member

peterflynn commented Feb 20, 2013

Done reviewing. Mostly minor changes, I hope, except for the big Languages.js comment and possibly some of the suggestions in DocumentManager...

Great work! I'm very psyched about getting this into Sprint 21...

DennisKehrig added some commits Dec 11, 2012

LESS extension: reloading the editors is no longer necessary since extensions are now loaded before the project is restored
Added the Languages module to have a centralized place for adding new languages

The LESS extension now uses this exclusively (after manually loading the CodeMirror mode)
Redesigned the language API with a fluent interface to allow for later refinement of language definitions.

Setting a CodeMirror mode is now optional.
Also, in alignment with the rest of the Brackets API, there is no explicit mention of CodeMirror anymore since we directly refer to modes as such.
Documents and editor now provide more direct APIs to access the used language.

This also allows overriding the language used in a document. Furthermore, that language is used to disambiguate the language that belongs to a submode.
Finally, the status bar entry that displays the language name now uses the Language API.
Added a default language with ID "unknown" so that documents always have a language.

Also refactored support for HTML, JavaScript, CSS and LESS to Languages.js (to the extent the language API allows).
Removed Editor.setModeForDocument - when renaming the current document, the full editor will be re-opened, thereby updating more than just its mode
Restrict refining languages to setting the mode
Let Languages.js return the default language, not Document.getLanguage
Define the default languages in Languages.js instead of languages.json so it's always available right away
Move loadBuiltinMode from ExtensionUtils.js to Languages.js

DennisKehrig added some commits Feb 21, 2013

Renamed language/Languages.js to languages/LanguageManager.js
Added top-level documentation
Moved all mode-loading code to _setMode
Added promise modeReady
Added documentation for the mode parameter
Added a check to make sure that only MIME modes defined by the given mode are used
Changed the mode parameter to either take a string or an array of two strings - i.e. "mode" or ["mode", "mimeMode"]
Removed the require.js "mode" alias to the CodeMirror mode directory - it's enough that LanguageManager loads these modes and it might cause conflicts if a language extension decides to add a mode.js
Renamed cs to csharp
Removed language aliases because the special "html" case was actually artificially introduced by us via TokenUtils.getModeAt and can be removed once we use the language API in more places
Contributor

DennisKehrig commented Feb 21, 2013

Rebased on master, fixed the tests and went through all your suggestions.
Thanks for a tremendous job, @peterflynn! You see a lot.

Some of the more important changes:

  • Languages.js is now LanguageManager.js
  • MIME modes are used like this now: mode: ["javascript", "application/json"]
  • Mode aliases are gone
  • The "extension" and "mode" plugins for require.js are gone
  • There's a language.modeReady promise that resolves when the mode is loaded (true even for already loaded modes)
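As an illustration of the `modeReady` idea from the list above, here is a sketch using a native Promise. The actual promise type used in the pull request is not shown in this thread, and `loadedModes`, `loadMode`, and `SketchLanguage` are hypothetical stand-ins for the real mode loading.

```javascript
// Hypothetical registry of modes that are already available.
var loadedModes = { javascript: true };

// Hypothetical loader; a real one would fetch the mode's script file.
function loadMode(mode) {
    return new Promise(function (resolve) {
        if (loadedModes[mode]) {
            resolve(mode); // already loaded: still resolves
            return;
        }
        setTimeout(function () {
            loadedModes[mode] = true;
            resolve(mode);
        }, 0);
    });
}

function SketchLanguage(id, mode) {
    this.id = id;
    this.mode = mode;
    this.modeReady = loadMode(mode); // callers wait on this before using the mode
}
```

Code that needs the mode can then call `language.modeReady.then(...)` without caring whether the mode was already loaded or had to be fetched first.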
Contributor

DennisKehrig commented Feb 21, 2013

@jasonsanjose I hope you didn't attempt creating more unit tests yet, it only really makes sense now because I completely forgot to take care of them before today. Sorry!

Member

jasonsanjose commented Feb 21, 2013

I didn't start work on unit tests yet. I can review how much more coverage is needed after this pull request lands.

src/editor/Editor.js
+ * Responds to language changes, for instance when the file extension is changed.
+ */
+ Editor.prototype._handleDocumentLanguageChanged = function (event) {
+ var mode = this._getModeFromDocument();
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

unused var

src/editor/Editor.js
@@ -1195,7 +1209,7 @@ define(function (require, exports, module) {
*
* @return {?(Object|string)} Name of syntax-highlighting mode, or object containing a "name" property
* naming the mode along with configuration options required by the mode.
- * See {@link EditorUtils#getModeFromFileExtension()}.
+ * See {@link Languages#getLanguageFromFileExtension()} and {@link Language#mode}.
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

Should be LanguageManager.getLanguageForFileExtension() ?

src/editor/Editor.js
/**
* Gets the syntax-highlighting mode for the document.
*
- * @return {Object|String} Object or Name of syntax-highlighting mode; see {@link EditorUtils#getModeFromFileExtension()}.
+ * @return {Object|String} Object or Name of syntax-highlighting mode; see {@link Languages#getLanguageFromFileExtension()} and {@link Language#mode}.
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

Should be LanguageManager.getLanguageForFileExtension() ?

* @param {!jQueryObject} container Container to add the editor to.
* @param {{startLine: number, endLine: number}=} range If specified, range of lines within the document
* to display in this editor. Inclusive.
*/
- function Editor(document, makeMasterEditor, mode, container, range) {
+ function Editor(document, makeMasterEditor, container, range) {
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

I don't believe that existing extensions will call this constructor, but we should address this API change.

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

Address how?

@jasonsanjose

jasonsanjose Feb 25, 2013

Member

Good question. :) @peterflynn maybe you can also chime in. Since we can't overload the constructor, maybe we can just leave the argument in and change line 312 to:

mode = this._getModeFromDocument() || mode;

Not the cleanest idea but at the moment I'm drawing a blank.

@peterflynn

peterflynn Feb 26, 2013

Member

In this case, I suggest we just make the change and document it as a breaking API change in the release notes. It's not really that supported to create Editors without going through EditorManager anyway, and I suspect there aren't any extensions doing so...

+ * language.setLineComment("--");
+ * language.setBlockComment("{-", "-}");
+ *
+ * Some CodeMirror modes define variations of themselves. The are called MIME modes.
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

Typo. "They are called..."

+ * Defines a language.
+ *
+ * @param {!string} id Unique identifier for this language, use only letters a-z and _ inbetween (i.e. "cpp", "foo_bar")
+ * @param {!Object} definition An object describing the language
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

Not sure about this syntax where the params are properties of definition

@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Perfect explanation. Thanks!

src/file/FileUtils.js
@@ -248,16 +248,32 @@ define(function (require, exports, module) {
}
/**
+ * Checks wheter a path is affected by a rename operation.
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

Typo "whether"

+ "fileExtensions": ["java"]
+ },
+
+ "coffeescript": {
@jasonsanjose

jasonsanjose Feb 21, 2013

Member

This set of default languages seems odd. We have a lot of non-web files here like C# and Java. Then we also have coffeescript and SASS which fall into the same preprocessed bucket as LESS. Should we limit this default set to our target languages (html, js, css) and let extensions provide the rest? Maybe just commenting these out would be fine or even creating an extension for all CodeMirror supported modes?

@DennisKehrig

DennisKehrig Feb 21, 2013

Contributor

They were previously supported to the same extent, so I wouldn't just comment them out without an immediate replacement. I suppose creating separate extensions for each of them would be an excellent starting point for people to flesh out support for these languages (i.e. by adding the comment styles), however we should wait with this until we have a better extension management system.
I would like for users to be able to open a language for the first time and with the click of a button download and install an extension that adds support for this language (rather than shipping all these individual extensions by default).

@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Understood

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

I asked Adam about where to log this task.

@peterflynn

peterflynn Feb 26, 2013

Member

I'll take an action to file a starter bug about doing this cleanup -- doesn't feel big enough to need a full Trello card IMHO.

@peterflynn

peterflynn Feb 26, 2013

Member

Oh actually, just noticed the bit about auto-discovery. Maybe we do need a Trello card so we can have a broader discussion: about that idea, about which languages should be in core, and about where the other extensions should live (do we maintain them so that they're "official"?).

@peterflynn

peterflynn Feb 27, 2013

Member

I spun off #2969 for this

@@ -733,7 +724,7 @@ define(function (require, exports, module) {
CommandManager.register(Strings.CMD_LINE_UP, Commands.EDIT_LINE_UP, moveLineUp);
CommandManager.register(Strings.CMD_LINE_DOWN, Commands.EDIT_LINE_DOWN, moveLineDown);
CommandManager.register(Strings.CMD_SELECT_LINE, Commands.EDIT_SELECT_LINE, selectLine);
-
+
@DennisKehrig

DennisKehrig Feb 21, 2013

Contributor

Oh, there it is!

// Use unique filename to avoid collissions in open documents list
- var dummyFile = new NativeFileSystem.FileEntry("_unitTestDummyFile_.js");
+ var dummyFile = new NativeFileSystem.FileEntry("_unitTestDummyFile_." + language._fileExtensions[0]);
@DennisKehrig

DennisKehrig Feb 21, 2013

Contributor

I'm wondering whether it's okay to access private fields in test cases, or whether this shouldn't be publicly accessible anyway, but read-only.

@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Yeah, that's fine. We do this already elsewhere in unit tests.

+ html._setLanguageForMode("xml", html);
+
+ // Currently we override the above mentioned "xml" in TokenUtils.getModeAt, instead returning "html".
+ // When the CSSInlineEditor and the hint providers are no longer based on moded, this can be changed.
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Typo "moded". Also, is there a bug filed for this to address it in the future? I'm not entirely clear on this relationship between the html and xml modes.

@peterflynn

peterflynn Feb 26, 2013

Member

This also seems like it hinders extensibility. Any other languages that work similarly can't be fully implemented without changes here and in TokenUtils.getModeAt(). The TypeScript extension seems like a perfect example -- the "typescript" mode is actually just a reconfigured "javascript" mode, so I bet all its tokens will read as JS without hacking core.

Seems like we should file a bug at the very least. And make sure this is captured in https://github.com/adobe/brackets/wiki/Language-Support.

@peterflynn

peterflynn Feb 26, 2013

Member

Also, would all this hackiness go away if htmlmixed referenced the xml mode via a MIME string ("text/html") instead of a one-off configuration bag ({name: "xml", htmlMode: true})? Maybe we should just submit that as a patch to CodeMirror.

@DennisKehrig

DennisKehrig Feb 26, 2013

Contributor

It would, basically, though we'd need to use "html" as the MIME mode (unless we don't mind changing all the places that now neatly say "html"). The XML mode actually would register a MIME mode "text/html" if that wasn't already taken - by htmlmixed.

TypeScript doesn't have this problem for the reason you state, it simply uses its MIME type.

@peterflynn

peterflynn Feb 26, 2013

Member

I can spin off a bug on this. (Related to the discussion at https://github.com/adobe/brackets/pull/2844/files#r3147341 and #2844 (comment))

@peterflynn

peterflynn Feb 27, 2013

Member

Spun off as #2965

+
+
+ // Public methods
+ module.exports = {
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Our typical convention is to modify the exports object instead of using a new literal.

exports.defineLanguage = defineLanguage;
...
+ // Also, all other modes so far were strings, so we spare us the trouble of allowing
+ // more complex mode values.
+ CodeMirror.defineMIME("text/x-brackets-html", {
+ "name": "htmlmixed",
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Is there a way to support mustache block comments here? {{! comment }}

@peterflynn

peterflynn Feb 26, 2013

Member

@jasonsanjose That seems like a separate feature to me... you'd need a real Mustache mode, which doesn't exist yet (right now it's just tokenized as plain HTML due to the "mode": null), and then we'd have to plumb it through the scriptTypes config below.

src/language/LanguageManager.js
+
+ var language = _fileExtensionsMap[extension];
+ if (language) {
+ console.warn("Cannot register file extension \"" + extension + "\" for " + this.name + ", it already belongs to " + language.name);
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

See error handling notes here https://github.com/adobe/brackets/wiki/Brackets%20Coding%20Conventions. This should probably be an error.

This case is unexpected, but we can recover gracefully since the already mapped language will be used instead. I believe there is a separate user story for allowing the user to choose their own language while a file is open.
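
As a rough illustration of the "log an error, but recover gracefully" behavior described above, here is a minimal sketch. It is not the code from this pull request: the free function `addFileExtension` and its boolean return value are assumptions made for the example; only the `_fileExtensionsMap` name and the message wording come from the diff.

```javascript
// Hypothetical sketch: map of lowercase file extension -> language object.
var _fileExtensionsMap = {};

// Registers an extension for a language. If the extension already belongs to
// another language, logs an error and keeps the existing mapping.
function addFileExtension(language, extension) {
    extension = extension.toLowerCase();
    var existing = _fileExtensionsMap[extension];
    if (existing && existing !== language) {
        console.error("Cannot register file extension \"" + extension + "\" for " +
            language.name + ", it already belongs to " + existing.name);
        return false;
    }
    _fileExtensionsMap[extension] = language;
    return true;
}

var phpLanguage = { name: "PHP" };
var otherLanguage = { name: "Other" };
addFileExtension(phpLanguage, "php");
addFileExtension(otherLanguage, "PHP");  // logs an error, mapping is unchanged
```

The first registration wins, so files with that extension keep resolving to the language that was mapped first.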

src/language/LanguageManager.js
+ */
+ function _setLanguageForMode(mode, language) {
+ if (_modeMap[mode]) {
+ console.warn("CodeMirror mode \"" + mode + "\" is already used by language " + _modeMap[mode].name + ", won't register for " + language.name);
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Should be console.error.

@jasonsanjose

jasonsanjose Feb 23, 2013

Member

Why are modes strictly 1-to-1 with a language? I might want to write a "language" for the WXS windows installer XML build file that uses the XML mode but has specific code hinting for tag names or attr values.

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

That would still work since the document's language would be WXS, as determined by the file extension. Editor then asks that language to resolve a mode to a language, and if a language uses a mode itself, it just returns itself. So globally the xml mode would map to the XML language, but within the WXS language, xml would map to WXS.

I just made LanguageManager.getLanguageForMode private. It is currently only used by Language.getLanguageForMode as a fallback. Right now I don't see a use case for global mode to language mapping, especially since it's easily ambiguous.
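
The lookup order described in this reply can be sketched roughly as follows. This is a simplified model, not the actual LanguageManager code: the `Language` constructor shape, the contents of the global `_modeMap`, and the `wxs` language are assumptions for illustration.

```javascript
// Hypothetical global fallback: CodeMirror mode name -> Language.
var _modeMap = {};

// Simplified stand-in for the real Language class.
function Language(id, mode) {
    this.id = id;
    this.mode = mode;
}

Language.prototype.getLanguageForMode = function (mode) {
    // A language that uses the mode itself resolves that mode to itself.
    if (mode === this.mode) {
        return this;
    }
    // Otherwise fall back to the global mode -> language mapping.
    return _modeMap[mode];
};

var xml = new Language("xml", "xml");
var wxs = new Language("wxs", "xml");
var css = new Language("css", "css");
_modeMap.xml = xml;  // globally, the "xml" mode maps to the XML language
_modeMap.css = css;

console.log(wxs.getLanguageForMode("xml").id);  // "wxs"
console.log(xml.getLanguageForMode("xml").id);  // "xml"
console.log(wxs.getLanguageForMode("css").id);  // "css"
```

Because the document's language is determined by file extension first, two languages can share one mode without the global map becoming ambiguous for either of them.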

@jasonsanjose

jasonsanjose Feb 25, 2013

Member

Changing getLanguageForMode to be private makes sense. Thanks.

@@ -93,6 +96,7 @@ define(function (require, exports, module) {
locale: brackets.getLocale()
});
contexts[name] = extensionRequire;
+ entryPoints[name] = entryPoint;
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Unused var entryPoints

Member

jasonsanjose commented Feb 22, 2013

Just did my own review on top of @peterflynn's since I was about to tackle unit tests.

src/language/LanguageManager.js
+
+
+/*jslint vars: true, plusplus: true, devel: true, nomen: true, indent: 4, maxerr: 50 */
+/*global define, $, brackets, CodeMirror, PathUtils, window */
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

window and brackets are unused

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

I tend to forget to check for these. Do you have a checklist or some tool that finds this? I think JSLint should actually complain about that, but mine doesn't.

@jasonsanjose

jasonsanjose Feb 25, 2013

Member

No tool, I just try to remember to check new files and any new modifications.

@peterflynn

peterflynn Feb 26, 2013

Member

Dennis, might be worth mentioning that idea at http://tech.groups.yahoo.com/group/jslint_com/. Crockford seems pretty responsive to adding new features if he agrees with them.

+ * @private
+ */
+ Language.prototype._addFileExtension = function (extension) {
+ extension = extension.toLowerCase();
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

In the future, it's likely that we'll allow users to customize what file extensions to associate by default for a language. I don't see anything that would prevent this in the future. It might be worth noting here in a comment though.

@peterflynn

peterflynn Feb 27, 2013

Member

Spun off as #2966

src/language/LanguageManager.js
+ }
+ }
+
+ return this;
@jasonsanjose

jasonsanjose Feb 22, 2013

Member

Return value is unused. Any reason to keep this chainable?

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

I suppose the question is whether we'd want it to be chainable if we make this public eventually. If so, just leave it in.
What are your reservations?

@jasonsanjose

jasonsanjose Feb 25, 2013

Member

Typically we haven't implemented chaining, that's all.

+ */
+ Language.prototype._setMode = function (mode) {
+ if (!mode) {
+ return;
@jasonsanjose

jasonsanjose Feb 23, 2013

Member

Should resolve modeReady promise here if no mode is specified.

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

I disagree - no mode actually is ready then.

@jasonsanjose

jasonsanjose Feb 25, 2013

Member

Is there a mechanism to affirm that a language does or does not have a mode? It seems like the modeReady promise exists in both cases.

@peterflynn

peterflynn Feb 26, 2013

Member

You could reject the promise in that case... But I keep coming back to the feeling that we just shouldn't allow modeless languages. The only use case we have right now is the special case of the default "unknown" language, and there are various ways we could preserve that case without allowing modeless languages in general...

@DennisKehrig

DennisKehrig Feb 26, 2013

Contributor

When it comes to deciding whether not to specify a mode at all or whether to specify it as { mode: "" }, I prefer the former because otherwise we could do the same for comments and all other future missing settings. undefined is very unambiguous.

To me, one case is enough to show that languages don't need to have a mode. Still, the case I had in mind when designing this was Jane Developer creating a new mini-language and her own compiler for that, but not feeling like creating a CodeMirror mode also. Or somebody else wanting to use Brackets with a language for which there just doesn't happen to be a CodeMirror mode yet. OR us supporting languages in the future that use a graphical editor (maybe a UI builder, or an image editor), and not CodeMirror, but still have language-like capabilities, like a "compiler" (image optimizer, sprite generator, image splitter, ...) and live-development support. Specifying a mode there would just be awkward, though we could potentially add a different concept for such files and let "language" continue to refer to text files only.

In the case of Jane Developer we might later want to allow somebody else to write an extension that adds a mode to her language. Previously, there was no mode, now a second extension adds one, and only THEN does modeReady speak the truth if it resolves.

But really, modeReady exists because I wanted the caller of defineLanguage to be able to get the language directly (var language = LanguageManager.defineLanguage(...)), while still being able to wait for the mode to load if necessary. The alternative would be to make defineLanguage always asynchronous, which is starting to seem more appropriate (technically we could return the language if no mode was specified and a promise otherwise, but I think that's too messy). _setMode would consequently also become asynchronous, losing its chaining ability. As Jason pointed out elsewhere, we don't do chaining anyway, so it seems that we should just go the asynchronous route. This allows us to get rid of modeReady.

This discussion raises interesting points, though. Once we make language definitions iterative, there is no clear concept anymore of when a language is fully defined. So code interested in a language might need to specify more than just its ID, in addition it might need to specify what aspects of its definition it depends on. We can't just use APP_READY as the cutoff point because extensions may continue to load asynchronously and we have no way for an extension to declare it's really done with loading (yet). Even then, if we make languages configurable, a user might specify a mode at any given point.

And then there's the question of whether we'd want the mode of a language to be changeable after it has been set once. An extension could offer an improved mode for a language that already has one (but which one would win?), or the user might change which mode to use for a language (maybe not directly, but possibly by changing the extension that adds base support for the language).

Some food for thought.
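
The "always asynchronous" alternative to modeReady discussed in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the standard `Promise` rather than jQuery's `$.Deferred`, `loadMode` is a hypothetical stand-in for loading a CodeMirror mode file, and the returned language object is a plain placeholder rather than the real Language class.

```javascript
// Hypothetical mode loader; the real code would load the CodeMirror mode file.
function loadMode(mode) {
    return new Promise(function (resolve, reject) {
        setTimeout(function () {
            if (mode === "missing") {
                reject(new Error("Mode \"" + mode + "\" could not be loaded"));
            } else {
                resolve(mode);
            }
        }, 0);
    });
}

// Always returns a promise. It resolves with the language once its mode,
// if one was specified, has been loaded.
function defineLanguage(id, definition) {
    var language = { id: id, name: definition.name, mode: undefined };
    if (!definition.mode) {
        // No mode specified: the language is available right away, modeless.
        return Promise.resolve(language);
    }
    return loadMode(definition.mode).then(function (mode) {
        language.mode = mode;
        return language;
    });
}

defineLanguage("less", { name: "LESS", mode: "less" }).then(function (language) {
    console.log(language.id + " uses mode " + language.mode);  // logs "less uses mode less"
});
```

With this shape, callers no longer need a separate modeReady promise, at the cost of never getting the language object synchronously.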

+ * @param {Array.<string>} definition.fileExtensions List of file extensions used by this language (i.e. ["php", "php3"])
+ * @param {Array.<string>} definition.blockComment Array with two entries defining the block comment prefix and suffix (i.e. ["<!--", "-->"])
+ * @param {string} definition.lineComment Line comment prefix (i.e. "//")
+ * @param {string|Array.<string>} definition.mode CodeMirror mode (i.e. "htmlmixed"), optionally with a MIME mode defined by that mode ["clike", "text/x-c++src"]
@jasonsanjose

jasonsanjose Feb 23, 2013

Member

Need to clarify behavior if no mode is provided.

src/language/LanguageManager.js
+ * @param {!string} description A helpful identifier for value
+ * @param {function(*, !string) validateEntry A function to validate the array's entries with
+ */
+ function _validateArray(value, description, validateEntry) {
@jasonsanjose

jasonsanjose Feb 23, 2013

Member

Unused function. Perhaps this was used to validate in _setMode?

@DennisKehrig

DennisKehrig Feb 25, 2013

Contributor

Ah, thanks! That was used to validate mode aliases.

Contributor

DennisKehrig commented Feb 25, 2013

Hey @jasonsanjose, thanks for the review! I updated the code. There are two things that I commented on that you may not yet be satisfied with.

Member

jasonsanjose commented Feb 25, 2013

Thanks @DennisKehrig. The latest changes are good, but there were additional comments that weren't addressed yet.

@@ -598,6 +599,11 @@ define(function (require, exports, module) {
this.file = file;
this.refreshText(rawText, initialTimestamp);
+ this._updateLanguage();
+ // TODO: remove this listener when the document object is obsolete.
+ // But when is this the case? When _refCount === 0?
@peterflynn

peterflynn Feb 26, 2013

Member

Hmm, yes this is a little tricky given the lack of weak references (/ weak listeners) in JS... The FileEntry will probably be a lot less permanent than the DocumentManager singleton this code used to listen to, but nonetheless we probably still do need to clean up the listener.

How about this -- on the first addRef() we add this listener, and on the last releaseRef() we clean it up. Anyone keeping a Document around for an asynchronous length of time is required to addRef() it, so the listener only matters when the refcount is non-zero.
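
A minimal sketch of that refcount-driven listener lifecycle is shown below. It is not the Brackets implementation: `FakeFile` and its `on`/`off`/`rename` methods are hypothetical stand-ins for the FileEntry event source, and `Doc` only models the parts of Document that matter here.

```javascript
// Hypothetical event source standing in for the FileEntry.
function FakeFile(fullPath) {
    this.fullPath = fullPath;
    this.listeners = [];
}
FakeFile.prototype.on = function (fn) { this.listeners.push(fn); };
FakeFile.prototype.off = function (fn) {
    this.listeners = this.listeners.filter(function (l) { return l !== fn; });
};
FakeFile.prototype.rename = function (newPath) {
    this.fullPath = newPath;
    this.listeners.slice().forEach(function (fn) { fn(); });
};

// Simplified Document: the rename listener exists only while refCount > 0.
function Doc(file) {
    this.file = file;
    this._refCount = 0;
    this.languageUpdates = 0;
    this._onRename = this._updateLanguage.bind(this);
    this._updateLanguage();
}
Doc.prototype._updateLanguage = function () { this.languageUpdates++; };
Doc.prototype.addRef = function () {
    if (this._refCount === 0) {
        this.file.on(this._onRename);   // first reference: start listening
    }
    this._refCount++;
};
Doc.prototype.releaseRef = function () {
    this._refCount--;
    if (this._refCount === 0) {
        this.file.off(this._onRename);  // last reference gone: stop listening
    }
};

var exampleFile = new FakeFile("styles.css");
var exampleDoc = new Doc(exampleFile);
exampleDoc.addRef();
exampleFile.rename("styles.less");      // triggers _updateLanguage
exampleDoc.releaseRef();                // listener removed, nothing retained
```

Once the last reference is released, the file no longer holds the document through its listener list, so the document can be garbage collected.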

@peterflynn

peterflynn Feb 27, 2013

Member

Spun off as #2961

+ _validateString(id, "Language ID");
+ // Make sure the ID is a string that can safely be used universally by the computer - as a file name, as an object key, as part of a URL, etc.
+ // Hence we use _ instead of "." since this makes it easier to parse a file name containing a language ID
+ if (!id.match(/^[a-z]+(\.[a-z]+)*$/)) {
@peterflynn

peterflynn Feb 26, 2013

Member

This allows "." but disallows "_", which doesn't match the docs. (I'm still confused by why we are bothering to place any restrictions on this at all, though -- we don't bother for other things like command IDs, and we have no foreseeable need for these things to be valid filenames).

@DennisKehrig

DennisKehrig Feb 27, 2013

Contributor

Using a dot, as in "foo.bar", would...

  • prevent us from doing something like $(LanguageManager).on("foo.barDefined") since jQuery will parse that as the "foo" event, belonging to "barDefined"
  • prevent us from doing something like LanguageManager.languages.foo.bar since JavaScript interprets the dot as separating property names, requiring us to use LanguageManager.languages["foo.bar"] instead
  • prevent us from doing something like { defines: ["language.foo.bar"] } when adding meta data to extensions since the dot has special meaning there, too (inspired by Kevin's proposals)
  • require developers to remember to escape the dot when using regular expressions with language IDs

An underscore doesn't have any of these problems.

Putting restrictions on this aids in maintaining consistency. Imagine CodeMirror were stricter.

  • Then ALL modes could be loaded via require("thirdparty/CodeMirror2/mode/<name>/<name>"). However, this doesn't work for the two rpm modes since they deviate from this convention.
  • Then we could detect mime modes by checking whether they contain a slash. But we can't, because the gfm mode defines a mime mode named "gfmBase".

Sticking to very basic things can make things simpler and less error prone, and if we manage to make do without dots in variable names, we shouldn't miss them too much in language IDs, either. I just like to be conservative here, I don't think being flexible in the language ID is an awesome feature that makes this API vastly better, nicer, simpler, etc. So I don't think it hurts to be strict here. So why risk closing doors?
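
Two of the pitfalls listed above (property access and regular expressions) can be reproduced in plain JavaScript. The sketch below is illustrative only: `isValidLanguageId` is a hypothetical validator that follows the underscore convention from the code comment, not the regular expression currently in the diff, and the `languages` object is a placeholder.

```javascript
// Hypothetical validator: lowercase words separated by underscores, no dots.
function isValidLanguageId(id) {
    return (/^[a-z]+(_[a-z]+)*$/).test(id);
}

var languages = {};
languages["foo.bar"] = { name: "dotted" };
languages.foo_bar = { name: "underscored" };

// Dot notation cannot reach the dotted key: languages.foo is undefined,
// so evaluating languages.foo.bar would throw a TypeError.
console.log(languages.foo === undefined);               // true
console.log(languages.foo_bar.name);                    // "underscored"

// An unescaped dot in a regular expression matches any character.
console.log(new RegExp("^foo.bar$").test("fooXbar"));   // true
console.log(new RegExp("^foo_bar$").test("fooXbar"));   // false

console.log(isValidLanguageId("foo_bar"));              // true
console.log(isValidLanguageId("foo.bar"));              // false
```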

+
+ // Define a custom MIME mode here because JSON files must not contain regular expressions
+ // Also, all other modes so far were strings, so we spare us the trouble of allowing
+ // more complex mode values.
@peterflynn

peterflynn Feb 26, 2013

Member

It took me a minute to figure out what the HTML mode had to do with JSON. Maybe make this comment more explicit by saying "Define ... here instead of trying to put it in languages.json because..."

@DennisKehrig

DennisKehrig Feb 27, 2013

Contributor

Good point, I totally agree!

Merge branch 'master' into dk/less-refactoring
Conflicts:
	src/file/FileUtils.js
Member

jasonsanjose commented Feb 26, 2013

@peterflynn @DennisKehrig Merged with master, fixed conflict in FileUtils.

Member

peterflynn commented Feb 26, 2013

@jasonsanjose Pushed the updates to fix JS code hints -- turned out to be simple. Want to review my commit real quick?

Member

jasonsanjose commented Feb 26, 2013

JavaScriptCodeHints changes look good

Member

peterflynn commented Feb 27, 2013

@DennisKehrig: There are still a few issues to address here (I think mostly nits) but @jasonsanjose and I agreed it's better to merge this now to get more bake time. Merging now, and then I'll file spinoff bugs assigned to you for the remaining bits.

peterflynn added a commit that referenced this pull request Feb 27, 2013

Merge pull request #2844 from adobe/dk/less-refactoring
Add language extensibility APIs; refactor LESS support out into a default extension using those APIs

@peterflynn peterflynn merged commit fee1311 into master Feb 27, 2013

@jasonsanjose jasonsanjose deleted the dk/less-refactoring branch Feb 27, 2013

jasonsanjose added a commit that referenced this pull request Feb 27, 2013

Member

peterflynn commented Feb 27, 2013

I've spun off all the remaining smaller code review comments into #2968.
