Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to get full revisions #3

Closed
lucaswerkmeister opened this issue May 22, 2022 · 6 comments
Closed

Add function to get full revisions #3

lucaswerkmeister opened this issue May 22, 2022 · 6 comments

Comments

@lucaswerkmeister
Copy link
Owner

I thought a queryFullRevisions() function, analogous to queryFullPages() (yield all the revisions in a continued response, e.g. from a generator), would make sense:

async function * queryFullRevisions(
    session,
    params,
    options = {},
) {
    params = makeParamsWithString( 'prop', params, 'revisions' );
    options = {
        dropTruncatedResultWarning: true,
        ...options,
    };
    for await ( const response of session.requestAndContinue( params, options ) ) {
        const query = response.query || {};
        let pages = query.pages || [];
        if ( !Array.isArray( pages ) ) {
            pages = Object.values( pages );
        }
        for ( const page of pages ) {
            const revisions = page.revisions || [];
            for ( const revision of revisions ) {
                yield revision;
            }
        }
    }
}

But I just realized that this leaves the caller with no way to know which page the revision belongs to. With the default rvprop, a revision looks like this:

{
    "revid": 1089180047,
    "parentid": 1089175962,
    "minor": false,
    "user": "50.107.154.111",
    "anon": true,
    "timestamp": "2022-05-22T10:10:46Z",
    "comment": "His Interpretation(s)"
}

There’s no indication here of the surrounding page, nor in any of the other available props (which makes sense in the context of a full API response).

This needs some more thinking about what the interface should look like. The session, params, options parameters should probably be the same, and it should return a generator, but I’m not sure what kind of objects the generator should yield.

@lucaswerkmeister
Copy link
Owner Author

Another thought I had was that there should maybe be an optional sortRevisions parameter, to determine the order in which the function yields revisions from the same response (e.g. by timestamp). Otherwise, that order will be pretty much arbitrary, without an easy way for the caller to work around it (because the caller doesn’t know when the next request starts). Perhaps that applies to queryFullPages() too (sortPages)… not sure.

lucaswerkmeister added a commit that referenced this issue May 22, 2022
Needs tests. Should also go in the README, but in a separate commit,
since PageByPageId and RevisionByRevisionId should also be in there.

I also realized that the interface needs more consideration, see #3.
@lucaswerkmeister
Copy link
Owner Author

One option would be to attach the outer page object as a member of the revision object – with a symbol as the key, and probably as a non-enumerable property. The symbol would be an anonymous one (i.e. not using Symbol.for()), exported by us. (I would probably make the property configurable and writable, to give users the freedom to change it if they really want to.) This has the advantage of being compatible with queryFullRevisionByRevisionId as well.

I’m not sure if I would want to add the page object to the original revision object that we got from the response (potentially shared with other combined requests, and also leading to a circular reference revision → page → revision → page), or if the returned revision would be something like { ...revision, [ pageSymbol ]: page }, i.e. a shallow copy so that the nested revision inside the page doesn’t have the symbol.

@lucaswerkmeister
Copy link
Owner Author

I’m not sure if I would want to add the page object to the original revision object that we got from the response (potentially shared with other combined requests, and also leading to a circular reference revision → page → revision → page), or if the returned revision would be something like { ...revision, [ pageSymbol ]: page }, i.e. a shallow copy so that the nested revision inside the page doesn’t have the symbol.

Alternatively, the page we attach could be a shallow copy from which we’ve removed the revisions key, to avoid returning that information twice…

@lucaswerkmeister
Copy link
Owner Author

I think at the moment I’m leaning towards doing both: attach, to a shallow copy of the revision, a shallow copy of the page without revisions. Note that destructuring assignment makes it very easy to get that shallow copy of the page:

let { revisions, ...remainingPage } = page;
if ( revisions === undefined ) {
    revisions = [];
}

lucaswerkmeister added a commit that referenced this issue May 28, 2022
Introduce a symbol which we use to decorate a revision object, returned
by the various revision functions, with its surrounding page. This is
expected to be useful in some situations where you want information
about the page as well – especially in the upcoming queryFullRevisions
(see #3), but also in the existing functions.

The symbol is not enumerable, so that it doesn’t pollute e.g.
console.log() output and generally doesn’t disturb anyone who’s not
interested in it; but it is writable and configurable, so that users who
want to change something about it can do so.

In the tests, a bunch of .equal() assertions need to change to .eql(),
since the returned object is no longer the same instance as the input
revision. For practical usage outside of tests, this should be fine.
(You could even argue that all the assertions, including for the “page”
functions, should use .eql() to start with, I think.)
lucaswerkmeister added a commit that referenced this issue May 28, 2022
Introduce a symbol which we use to decorate a revision object, returned
by the various revision functions, with its surrounding page. This is
expected to be useful in some situations where you want information
about the page as well – especially in the upcoming queryFullRevisions
(see #3), but also in the existing functions.

The symbol is not enumerable, so that it doesn’t pollute e.g.
console.log() output and generally doesn’t disturb anyone who’s not
interested in it; but it is writable and configurable, so that users who
want to change something about it can do so.

In the tests, a bunch of .equal() assertions need to change to .eql(),
since the returned object is no longer the same instance as the input
revision. For practical usage outside of tests, this should be fine.
(You could even argue that all the assertions, including for the “page”
functions, should use .eql() to start with, I think.)
@lucaswerkmeister
Copy link
Owner Author

The function should probably also yield missing revisions (if it was called with revids specifying revision IDs that don’t exist), I think. (With no surrounding page in that case, of course.)

lucaswerkmeister added a commit that referenced this issue May 29, 2022
This function is analogous to queryFullPages(), and should address the
potential pitfall of using prop=revisions with the “full pages”
functions (which would try to assemble all revisions of the page). It’ll
receive some further improvements, but this should be good enough as an
initial version.

Part of #3.
lucaswerkmeister added a commit that referenced this issue May 29, 2022
I assume this can only happen when you use revids instead of a
generator, but still, it seems better to yield them (with the missing
key added if needed, so extract the missingRevision() helper function)
than to silently drop them.

Part of #3.
lucaswerkmeister added a commit that referenced this issue May 29, 2022
I assume this can only happen when you use revids instead of a
generator, but still, it seems better to yield them (with the missing
key added if needed, so extract the missingRevision() helper function)
than to silently drop them.

Part of #3.
@lucaswerkmeister
Copy link
Owner Author

I think we can close this with d133d1b and 25ff01d; #4 exists for the sortRevisions parameter, and the README will be updated in general (some other important functions are also missing at the moment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant