Expose (version) information to user scripts #1452

Closed
arantius opened this Issue October 22, 2011 · 29 comments

5 participants

arantius nascentt Mike Medley Johan Sundström Ryan Chatham

arantius
Collaborator

There should be a way for existing "userland" update scripts to know that they're running in a version of Greasemonkey that does updates itself, and skip running. In order for this to be useful, they should also know whether updates are running.

The branch already merged from #1053 did something like this, but I didn't keep it in the merge because I wasn't totally comfortable with exactly the way it was implemented. I always meant to revisit. This issue is to track that this happens.

nascentt

I'd love this very much. Being able to have my own update method for users not yet using a GM with auto-updating scripts would be amazing. But also having a way to do an update if the source URL isn't yet known would be amazing. Something like:

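// Pseudocode: gmVersion, isOlderThan, GMHasSource and myScript are placeholder names.
if (isOlderThan(gmVersion, '0.9.13')) { // 0.9.13 is not a valid number literal
  doMyUpdateCheck();
}
else {
  if (!GMHasSource(myScript)) {
    doMyUpdateCheck();
  }
}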

arantius
Collaborator

I'm thinking there's a much larger overall hole to be filled. We can provide information about the platform (greasemonkey itself) and the script (its own metadata). The latter is especially useful for an @require, to know the info of whatever script it is included in. So e.g. the platform version would be in there, and the script's downloadURL (perhaps, exact details will need to be figured out).

Or if we're gonna do it, let's really do it.

Mike Medley

The branch already merged from #1053 did something like this, but I didn't keep it in the merge because I wasn't totally comfortable with exactly the way it was implemented.

Care to elaborate?

arantius
Collaborator

Mostly because it seemed a little haphazard, and I was hoping we could do something like this instead.

arantius
Collaborator

Spitballing what this might look like. In the sandbox:

const GM_info = {
  'version': '0.9.13',
  'scriptWillUpdate': true, // will Greasemonkey auto-update this script?
  'script': {
    'name': 'Example User Script',
    'include': [
      'http://www.example.com/*',
      'http://www.example.org/*',
    ],
    'exclude': [
      'http://www.example.com/login*',
    ],
    'version': '1.0',
    // ... and so on for all defined metadata ...
  },
  'scriptMetaStr': '// @name ...\n// @version ...\n.........',
};

The willUpdate value is affected by many things: do we know the URL (i.e. was it installed before/after 0.9.0), did it come from a local file, is this script enabled for updates, are updates globally enabled, etc.
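
One way the listed factors might combine is sketched below. This is only an illustration: the function name and every field name (downloadURL, installedFromLocalFile, updatesEnabled, globalUpdatesEnabled) are assumptions, not Greasemonkey internals.

```javascript
// Hypothetical sketch: every name here is an assumption, not Greasemonkey's real code.
function scriptWillUpdate(script, prefs) {
  return Boolean(
    prefs.globalUpdatesEnabled &&     // are updates globally enabled?
    script.updatesEnabled &&          // is this script enabled for updates?
    script.downloadURL &&             // do we know the URL it was installed from?
    !script.installedFromLocalFile    // did it come from a local file?
  );
}

const example = scriptWillUpdate(
  {
    updatesEnabled: true,
    downloadURL: 'http://www.example.com/example.user.js',
    installedFromLocalFile: false,
  },
  { globalUpdatesEnabled: true }
);
```

Any single failing condition yields false, which is the case a userland updater would care about.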

What about the formatting? Should it be script.willUpdate and script.metadata.xxx? Something else?

What else does Greasemonkey know that a script might want?

nascentt

Like the look of the proposed functionality. willUpdate should be very useful.

Edit: Oops. So I guess modifying my previous example it'd be:

var gmVersion = GM_info.version; // a string property; '0.9.13' * 1 would be NaN
if (isOlderThan(gmVersion, '0.9.13')) { // isOlderThan: placeholder for a dotted-version comparison
  doMyUpdateCheck();
}
else {
  if (GM_info.scriptWillUpdate == false) {
    doMyUpdateCheck();
  }
}

Would that be about right? If so, everything looks good to me. I'm undecided on the metadata.xxx block: I like the cleanliness of script.version, but if there's a lot of metadata, I guess the metadata block makes more sense from an organisational standpoint.
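
Because the proposed version is a dotted string such as '0.9.13', it cannot be compared as a number. A minimal sketch of a piecewise comparison follows, assuming the GM_info shape proposed above; compareVersions and shouldRunOwnUpdateCheck are hypothetical helper names, not part of any Greasemonkey API.

```javascript
// Sketch only: both function names are hypothetical helpers.
// Returns -1, 0 or 1 by comparing each dot-separated part numerically.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0); // missing or non-numeric parts count as 0
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

// `info` is the proposed GM_info object, or undefined where it does not exist.
function shouldRunOwnUpdateCheck(info) {
  if (!info) return true; // no GM_info at all
  if (compareVersions(info.version, '0.9.13') < 0) return true;
  return info.scriptWillUpdate === false;
}
```

A script could then call shouldRunOwnUpdateCheck(typeof GM_info === 'undefined' ? undefined : GM_info) before running its own updater.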

arantius
Collaborator

Using this will it be possible to get the greasemonkey version too?

That was the first line of data in my proposed example above.

Johan Sundström
Collaborator

+1 on exposing GM's parsed view of the script's metadata to the script itself – that makes stuff more DRY. Good syntax.

What I'd add is (not necessarily as properties on GM_info – an API function accessor like GM_metainfo() may be better) a less schema-ful view of all script metainfo headers (where they all show up as arrays, one per header) – which, for this script, would look something like:

{ "name": ["Example User Script"]
, "include":
  [ "http://www.example.com/*"
  , "http://www.example.org/*"
  ]
, "exclude": ["http://www.example.com/login*"]
, "version": ["1.0"]
, ... other script headers GM might or might not support here ...
}

The use case would be to allow script and library authors to prototype useful GM headers, so we can present viable ways for people to showcase stuff in @require libraries that, if they turn out popular and well merited, might be considered for GM proper. We have had huge threads on that kind of thing on GM-dev for proposals like @xpath and other meta things, largely because we have no way to improve GM without changing its core today.

arantius
Collaborator

For that sort of usage, it would seem best to me to simply expose the string between the ==UserScript== bits, and let scripts deal with parsing it from there, in whatever way they elect to use it beyond the documented behavior.

Johan Sundström
Collaborator

If we API it we could easily support either, though I doubt the string version would get as much use. What's good about the two proposals above are their no-fuss webbiness for the common use case.

Mike Medley

and let scripts deal with parsing it from there

If the whole point of offering them this value is so they can parse information out of it, why not parse it for them?

Ryan Chatham

I see no point in getting the raw ==UserScript== text when I am going to parse it anyways. It should be parsed automatically by default. Perhaps a solution like this could satisfy those who need the raw text?

function GM_metainfo(returnRaw) {
    if (returnRaw === true) {
        // Return raw ==UserScript== text
    }
    else {
        // Return parsed info
    }
}

arantius
Collaborator

Johan said "let's expose every parsed item!". I said "expose the string" (alongside the parsed, supported, items).

I see no point in getting the raw ==UserScript== text when I am going to parse it anyways.

Because you know how to parse it.

Because we can't possibly parse every possible type of item that anyone might ever make up in the future. A while back we would have just given name/value keypairs. Then we get things like @resource, which has multiple values (name and url). Some keys (e.g. @include) are legitimately repeated, and all values are significant. Others (e.g. @name) should not be repeated and only the last is significant.

To repeat: we can't know the right way to handle every possible thing anyone can make up. So we expose the info we already know (and use), structured. Then we give the whole string, so that those authors who might want to make up their own things can do so.

Johan Sundström
Collaborator

To date, every @header imperative we have created and incorporated has been a single-line property starting with // @ and followed by a series of non-whitespace characters that make up a name identifier, ending at the end of that line and ignoring any whitespace immediately after the identifier.

I find it highly likely that 90%+ of all future header imperatives will be too, and that most users will want exactly that, as well. Some may use multiple whitespace-separated values on that line, like @require, many will not. That difference is minimal compared to all the repeated effort of splitting by line, filtering out non-header lines, picking out the lines with the right header name and removing the header, unintentionally not getting it quite the way GM does in the process, and so on.

We could do as @cletusc suggests above if we really care for the "whole string" part (I personally don't, but I wouldn't fight supporting it, especially not if someone wants it), but the important part to me would be an api which, for a header like

// ==UserScript==
// @name            My extended script
// @namespace       https://someone.github.com/
// @description     Cool things
// @include         http://google.com/*
// @include         https://google.com/*
// @resource css    styles.css
// @require         lib.js
// @xpath img_links //a[@href][.//img[@src]]
// @unwrap
// ==/UserScript==

returns this easy-to-consume object:

{ "name": ["My extended script"]
, "namespace": ["https://someone.github.com/"]
, "description": ["Cool things"]
, "include":
  [ "http://google.com/*"
  , "https://google.com/*"
  ]
, "resource": ["css    styles.css"]
, "require": ["lib.js"]
, "xpath": ["img_links //a[@href][.//img[@src]]"]
, "unwrap": [null]
}

(null here being my proposal for what to do with headers present, but without later whitespace on the line).
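
A minimal sketch of how such a half-parsed view could be produced from the raw header text is shown below. It assumes the single-line "// @name value" format described above; parseHeaders is a hypothetical name, and the regular expressions approximate rather than reproduce Greasemonkey's own parser.

```javascript
// Sketch of the "half-parsed" view: one array per header name, values left unsplit.
// parseHeaders is a hypothetical name; this is not Greasemonkey's actual parser.
function parseHeaders(source) {
  const headers = {};
  let inBlock = false;
  for (const line of source.split(/\r?\n/)) {
    if (/^\s*\/\/\s*==UserScript==/.test(line)) { inBlock = true; continue; }
    if (/^\s*\/\/\s*==\/UserScript==/.test(line)) break;
    if (!inBlock) continue;
    // "// @" + name, then optionally whitespace and the rest of the line.
    const match = /^\s*\/\/\s*@(\S+)(?:\s+(.*?))?\s*$/.exec(line);
    if (!match) continue;
    const name = match[1];
    // A header with nothing after its name is recorded as null.
    const value = match[2] === undefined || match[2] === '' ? null : match[2];
    if (!Object.prototype.hasOwnProperty.call(headers, name)) headers[name] = [];
    headers[name].push(value);
  }
  return headers;
}
```

Interior whitespace in a value is kept as written, so a consumer that cares about it (an @xpath library, say) still sees the original spacing.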

Johan Sundström
Collaborator

Huh. I never noticed "scriptMetaStr" in your original proposal.

To me it feels a little out of place in the proposed GM_info constant, but would look somewhat at home in the GM_metainfo(/*returnRaw=*/true) "I really crave doing magic parsing myself" use case. (Which has more potential to muck up future user script headers than the proposed "half-parsed" mode does, which keeps user script headers in line with the format we have adopted.)

arantius
Collaborator

Huh. I never noticed "scriptMetaStr" in your original proposal.

I edited that in and, upon re-reading comment history, see that I forgot to mention it.

arantius
Collaborator

Seems like I'm in the minority. Can someone championing full-raw-parsed-data provide an entire sample GM_info object as a literal, like I did above? Does it still have both versions of the metadata? How are they organized?

Johan Sundström
Collaborator

That's what my example in #1452 (comment) was. It'd just be another const GM_headers = {...}; next to your const GM_info = {}; if we'd end up exposing both as raw constants in the script. And maybe const GM_raw_header = '// @name ...\n// @version ...\n.........' if we do it that way.

Mike Medley

@johan
I would parse the sample metadata you provided (modified a little) like:

// ==UserScript==
// @name                 My extended script
// @namespace            https://someone.github.com/
// @description          Cool things
// @include              http://google.com/*
// @include              https://google.com/*
// @resource css         styles.css
// @resource othercss    otherstyles.css
// @require              lib.js
// @xpath img_links      //a[@href][.//img[@src]]
// @unwrap
// ==/UserScript==
{
  "name": ["My", "extended", "script"],
  "namespace": ["https://someone.github.com/"],
  "description": ["Cool", "things"],
  "include": ["http://google.com/*", "https://google.com/*"],
  "resource": [
    ["css", "styles.css"],
    ["othercss", "otherstyles.css"]
  ],
  "require": ["lib.js"],
  "xpath": [
    ["img_links", "//a[@href][.//img[@src]]"]
  ],
  "unwrap": [null]
}

Basically, the text preceded by the "@" sign becomes the key, and any text that follows, up to the end of the line, is broken into an array with space as the delimiter. If a line has only a single value, the plain string is used instead of an array. Each line's result is then added to the array for its key, so that more than one value per key can be saved (the same behavior as Johan's example). I'm not sure I explained that well enough, but hopefully the example above speaks for itself. This parse method works universally and no data is lost. To get the whole value of @name or @description in this example, the script writer would just use join(" ") on that array.

Johan Sundström
Collaborator

@sizzlemctwizzle Just to clarify, I am voting for having both my GM_headers and arantius' suggested GM_info for GM-parsed headers (which in all likelihood will be the most useful and frequently used of these concepts). That one should presumably do something fancy like your treatment of @resource above.

My championed GM_headers (under whichever name; I would probably prefer a shorter one like GM_head on second thought) can't do the kind of thing you are suggesting, since the whole point of it is not to presume any knowledge of what any header means. In your example you treat @description and @resource differently, which sounds like a future maintenance problem waiting to happen: when we add some new @resource-like keyword @x, all scripts that were relying on GM_headers.x arrays being in the old format suddenly break.

Johan Sundström
Collaborator

Scripts consuming these half-parsed GM_headers can split up their input on /\s+/ with minimal effort if they need to, but if we do it for them, something like @xpath triple_ws //text()[contains(.,'   ')] would not work, as it would nuke it to ["triple_ws", "//text()[contains(.,'", "')]"] losing valuable information for an @xpath-eating library.

Hence "half-parsed", in my notes above. I'm looking for the best compromise between hackable and can-do-everything, and I believe "hackable" is the most important part.
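
The difference can be seen in a small sketch. splitFirst is a hypothetical helper a consuming library might write for itself; it peels off only the first token and leaves the remainder untouched.

```javascript
// Sketch: peel off only the first token and keep the remainder exactly as written.
// splitFirst is a hypothetical helper name.
function splitFirst(value) {
  const match = /^(\S+)\s+([\s\S]*)$/.exec(value);
  return match ? [match[1], match[2]] : [value];
}

const raw = "triple_ws //text()[contains(.,'   ')]";
const naive = raw.split(/\s+/);   // ["triple_ws", "//text()[contains(.,'", "')]"]
const careful = splitFirst(raw);  // ["triple_ws", "//text()[contains(.,'   ')]"]
```

With the half-parsed value in hand, each library can choose whichever of the two behaviors suits its own header.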

Mike Medley

I see your point with the @xpath example. Guess that idea was a little half-baked. Yeah, I'd be fine with your half-parsed headers. Seems more flexible.

I believe "hackable" is the most important part.

Yes it is. We don't want to force constraints upon people if it isn't absolutely necessary.

Ryan Chatham

I am a +1 for @johan's last entry.

// ==UserScript==
// @name                 My extended script
// @namespace            https://someone.github.com/
// @description          Cool things
// @include              http://google.com/*
// @include              https://google.com/*
// @resource css         styles.css
// @resource othercss    otherstyles.css
// @require              lib.js
// @require              otherlib.js
// @xpath img_links      //a[@href][.//img[@src]]
// @unwrap
// @mykey                someValue
// @mykey                someOtherValue
// ==/UserScript==

would become

{
    'name': ['My extended script'],
    'namespace': ['https://someone.github.com/'],
    'description': ['Cool things'],
    'include': [
        'http://google.com/*',
        'https://google.com/*'
    ],
    'resource': [
        'css         styles.css',
        'othercss    otherstyles.css'
    ],
    'require': [
        'lib.js',
        'otherlib.js'
    ],
    'xpath': [
        'img_links      //a[@href][.//img[@src]]'
    ],
    'unwrap': [null],
    'mykey': [
        'someValue',
        'someOtherValue'
    ]
}

...where any user-defined keys would act like a @require or @include; populate the array with one metadata line per array entry, in the order that they appear. Key/value pairs that GM treats as strict should stay strict when getting it from this function/constant; only show the @name and @namespace that is used by GM.

If I decided to create a metadata entry that is delimited by spaces, I could easily parse just the keys I need and split by spaces. It is far easier parsing one key to meet my demands than it is parsing an entire block of text myself and possibly goofing.

Now whether or not this is used in a global GM_info (GM plus the script info) or an actual API GM_metainfo (just the script info) doesn't matter to me. Really, you could put it in GM_info and just have the API grab from it the same way I mentioned up above.

Johan Sundström
Collaborator

I think we present the same idea, except maybe this part:

Key/value pairs that GM treats as strict should stay strict when getting it from this function/constant; only show the @name and @namespace that is used by GM.

In the half-parsed GM_head proposal, name and namespace would be treated no different from anything else, now or in the future: if some script has twelve of each, those properties become twelve-item arrays. In the GM_info object, they are single strings, without any arrays wrapping them.

Both GM_head and GM_info should be simple and internally consistent views on the header data, and they target different use cases.
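
To illustrate how the two views relate, here is a sketch that derives GM_info-style single values from half-parsed arrays, following the rule stated earlier in the thread that only the last @name is significant while every @include counts. toStrictView and the particular keys handled are assumptions for illustration only.

```javascript
// Sketch: derive a GM_info-style "strict" view from half-parsed header arrays.
// toStrictView and the set of keys it handles are assumptions for illustration.
function toStrictView(headers) {
  const last = (values) =>
    values && values.length ? values[values.length - 1] : null;
  return {
    name: last(headers.name),            // repeated @name: only the last one counts
    namespace: last(headers.namespace),
    include: (headers.include || []).slice(), // every @include is significant
    exclude: (headers.exclude || []).slice(),
  };
}
```

The half-parsed input is left unmodified, so both views can be exposed side by side.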

Ryan Chatham

In the half-parsed GM_head proposal, name and namespace would be treated no different from anything else, now or in the future: if some script has twelve of each, those properties become twelve-item arrays. In the GM_info object, they are single strings, without any arrays wrapping them.

Gotcha, I agree with you on doing it that way then. Just as long as there is some way to get what GM is using for the strict items.

arantius
Collaborator

Any code forthcoming for these proposals?

arantius referenced this issue from a commit December 08, 2011
Commit has since been removed from the repository and is no longer available.
arantius
Collaborator

I've taken a very shallow stab at this. To note:

  • I left my proposed defined headers + raw string approach. At least for now.
  • I could use input where I left "???" markers.
  • Naming? Do we want keys of e.g. "run-at" like the "@" name, or "runAt" like a JS identifier?

Ryan Chatham

If the script contains a key, regardless of what it is, it should be shown in the "script" portion.
If a key doesn't exist (exclude, icon, etc.), it shouldn't show on the "script" portion.
Naming should be identical to the actual key; use "run-at" instead of "runAt".

arantius
Collaborator

Nobody's contributed code yet. So I'm going to split "structured" data out of this issue and resolve this with what I've already written, which for reference produces e.g.:

{
  "version":"0.9.15",
  "scriptWillUpdate":false,
  "script":{
    "description":"",
    "excludes":[],
    "includes":["http*"],
    "matches":[],
    "name":"GM_info test",
    "namespace":"http://github.com/arantius",
    "run-at":"document-end",
    "unwrap":false,
    "version":null
  },
  "scriptMetaStr":"// @name           GM_info test\n// @namespace      http://github.com/arantius\n// @include        http*\n"
}

I'm resolving it this way because it should be easy to add more in the future, but there isn't clear agreement yet on what it should be or how it should work. As written (and this is easy to explain in documentation), the values in script are the ones Greasemonkey actually uses (minus user 'cludes -- should we add them?). Another key could hold some sort of semi-parsed version of these. I think, however, that A) we'd have to write new code to parse the scriptMetaStr and B) that should probably be written as a @require rather than in GM core.
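
For reference, a script might consume an object of this shape roughly as sketched below. describe is a hypothetical function, and the sample object copies only a few of the fields shown above; note that a hyphenated key such as "run-at" needs bracket notation.

```javascript
// Sketch: consuming a GM_info-shaped object. describe is a hypothetical name.
function describe(info) {
  if (!info) return 'GM_info unavailable; run the userland updater';
  const runAt = info.script['run-at']; // hyphenated key needs bracket notation
  const updater = info.scriptWillUpdate
    ? 'Greasemonkey updates this script'
    : 'run the userland updater';
  return `${info.script.name} (run-at: ${runAt}): ${updater}`;
}

// A trimmed copy of the sample output above, used here in place of the real GM_info.
const sample = {
  version: '0.9.15',
  scriptWillUpdate: false,
  script: { name: 'GM_info test', 'run-at': 'document-end' },
};
const summary = describe(sample);
```

Inside a real user script the argument would be typeof GM_info === 'undefined' ? undefined : GM_info, so that older Greasemonkey versions fall through to the first branch.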

arantius closed this in b63e754 February 06, 2012