Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for loading metadata files #223

Merged
merged 6 commits into from
Nov 12, 2018

Conversation

cmyr
Copy link
Contributor

@cmyr cmyr commented Nov 2, 2018

This adds initial support for loading metadata from .tmPreferences
files. The main focus of this patch is loading the patterns used to
determine indentation. I expect to follow up with support for comment
strings, with support for other features possible in the future.

This is all gated behind a new metadata feature. If enabled, a
top-level Metadata object will be included in SyntaxSet; this object
can be queried for the metadata available for a given scope.

This patch does not attempt to implement any actual indentation logic;
that is left up to the consumer.


So I've added a DefaultPackage package to testdata; it's just two files, but maybe we should just put it in the repo with the other packages?

I've generally tried to keep the scope of this as narrow as possible, no bells & whistles; happy to iterate from here.

I've implemented simple autoindent (just using increaseIndentPattern and decreaseIndentPattern) in xi using this, and I'm happy with the results.

Some of the logic around loading is a bit gnarly right now; I was doing a bunch of logging to figure out what all metadata fields existed, etcetera. So let's call this a rough cut?

Copy link
Owner

@trishume trishume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Some comments. Still may want to spend some more time wrapping my head around the approach of converting the files to the final metadata in case I can see a cleaner way.

if !KEYS_WE_USE.contains(&key.as_str()) {
continue;
}
//NOTE: because we can't guarantee the order that files get loaded,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this conflict with the doc comment about last writer winning? I think Sublime just does last writer wins and loads the packages in alphabetical-ish order or something. This seems like it would be pretty unintuitive behaviour.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is out of date and I agree just doing alphabetical order on the files is much simpler, will make that change.


/// The metadata items that match the given scope. The result may be empty.
#[cfg(feature = "metadata")]
pub fn metadata_for_scope(&self, scope: &[Scope]) -> ScopedMetadata {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper doesn't seem like it saves enough code to be worth it to me.

}


impl Pattern {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to say that it would be nice if we could refactor SyntaxDefinition to use this as well, but I see that the fact that regex_str is public would make that a breaking change. I may still want to do that eventually but doesn't have to be this PR.

.map(|score| (score, meta_set))
}).collect::<Vec<_>>();

metadata_matches.sort_by(|one, two| two.0.cmp(&one.0));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hint: sort_by_key makes this cleaner

@cmyr
Copy link
Contributor Author

cmyr commented Nov 6, 2018

@trishume thanks for the quick look. I agree that the initial loading/conversion feels a bit over-complicated, but couldn't think of an easier approach that correctly handles adding new files.

I'll do some cleanup on this a bit later on today.

As an addendum: I'd like to add support for the comment-related members of shellVariables, which I could either add on here or do as a follow-up PR. In any case, I see three basic approaches to this:

  • expose a shellVariables that includes all the keys in the metadata files, and just leave it entirely up to the consumer
  • extract the comment related metadata items, and have, say,
    struct MetadataItems {
        // ...
        comment_one: Option<(String, Option<String>)>,
        comment_two: Option<(String, Option<String>)>,
        comment_three: Option<(String, Option<String>)>,
    }
  • or just go ahead and use the interface we expect the client to want, which is more like,
    struct MetadataItems {
        // ...
        line_comment: Option<String>,
        block_comment: Option<(String, String)>,
    }

This latter is the cleanest and the easiest to use but requires us to make some assumptions and to throw away some metadata. It feels like the best approach for now, but if you have an opinion I'm happy to hear it.

@keith-hall
Copy link
Collaborator

I personally vote for:

expose a shellVariables that includes all the keys in the metadata files, and just leave it entirely up to the consumer

as it is the most flexible/useful :)

@cmyr
Copy link
Contributor Author

cmyr commented Nov 6, 2018

Okay I've updated this to simplify the merging logic and address some of the other feedback.

I've noticed a probable bug here, as well: we go to pretty major effort to keep around the 'raw' metadata, so that we can correctly merge if new files are added; however we don't include the raw metadata when we do gendata, so adding new metadata to a syntax set that was loaded from a dump won't work. I'll resolve this in an upcoming commit.

@cmyr
Copy link
Contributor Author

cmyr commented Nov 6, 2018

Okay, an issue with the raw metadata stuff:

bincode doesn't support deserialize_any, which means we can't serialize serde_json::Value, so we can't really include the raw metadata in its current form in the pack files. (Including the raw metadata also more than doubles the size of the pack files.)

We could take this as an opportunity to simplify the overall approach; we could discard raw metadata when we build the SyntaxSet, and if the syntax set goes back to a builder we reset it. In this world adding more metadata after having generated a SyntaxSet would mean the new metadata wouldn't merge with the old; a scope that was present in the new metadata would overwrite anything that already existed.

This seems reasonable for any use-cases that I can imagine, and generally simplifies things; let me know if it makes sense to you?

@cmyr
Copy link
Contributor Author

cmyr commented Nov 6, 2018

I've gone ahead and added a commit that makes the change described above. Let me know if you prefer a different approach.

@@ -2,26 +2,40 @@
//! syntect, not as a helpful example for beginners.
//! Although it is a valid example for serializing syntaxes, you probably won't need
//! to do this yourself unless you want to cache your own compiled grammars.
//!
//! The standard command to generate the syntax dumps (including metadata) is:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you replace the command under make packs in the Makefile with this? Then maybe replace this comment with a reference to this script being used in the Makefile using make packs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

where F: FnMut(&MetadataItems) -> Option<T>
{
self.items.iter()
.map(|(_, meta_set)| &meta_set.items)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know that it's the case that Sublime does this trying multiple patterns thing? Seems rarely applicable and I wouldn't be surprised if Sublime only tries the matchiest, which would let you simplify a bunch of this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe sublime does it this way; this is why we need the new default package, because it includes some base settings for the source and comment scopes. @keith-hall suggested this implementation in #183 or #179.

Copy link
Owner

@trishume trishume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two little things, sorry I missed these on previous passes.

@@ -3,7 +3,7 @@
//! settings like selection color, `ThemeSet` for loading themes,
//! as well as things starting with `Highlight` for how to highlight text.
mod selector;
mod settings;
pub(crate) mod settings;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously public and even though I think literally nobody uses it I'd rather not break semver compatibility for little reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was previously private, I exposed it because I needed StackSelectors.

Copy link
Contributor Author

@cmyr cmyr Nov 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

er, not StackSelectors but Settings and SettingsObject? because I'm loading plists.

edit: I can avoid this and just redeclare those typedefs locally, I do this already for SettingsObject.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops derp cool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looked at this again, it's because I'm using load_plist. Could reexport manually if that makes more sense?


/// Generates a `MetadataSet` from a single file
#[cfg(test)]
pub fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {
pub(crate) fn quick_load(path: &str) -> Result<MetadataSet, LoadingError> {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

This adds initial support for loading metadata from `.tmPreferences`
files. The main focus of this patch is loading the patterns used to
determine indentation. I expect to follow up with support for comment
strings, with support for other features possible in the future.

This is all gated behind a new `metadata` feature. If enabled, a
top-level `Metadata` object will be included in `SyntaxSet`; this object
can be queried for the metadata available for a given scope.

This patch does not attempt to implement any actual indentation logic;
that is left up to the consumer.
This also includes a bit of PR feedback and cleanup.
@trishume
Copy link
Owner

Oh whoops derp one last request sorry I keep forgetting things: Can you modify travis.yml to pass --features metadata to cargo test so that CI tests the new code?

@cmyr
Copy link
Contributor Author

cmyr commented Nov 12, 2018

@trishume yep can do that now

@cmyr
Copy link
Contributor Author

cmyr commented Nov 12, 2018

Okay, should be good to go?

Copy link
Owner

@trishume trishume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all this!

@trishume trishume merged commit c2e857f into trishume:master Nov 12, 2018
@cmyr cmyr deleted the feature/metadata branch November 12, 2018 02:16
@cmyr
Copy link
Contributor Author

cmyr commented Nov 12, 2018

Thanks for taking the time out of your Sunday to get it merged, much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants