Enhancement: print the source event, output data or event in JSON instead of plain text format #92

opk12 · 2022-08-31T11:51:30Z

--listen and --tail print events in a human-readable format. I want the json, to parse it easily in my script. Could matrix-commander print the json by default or with a command-line flag?

The text was updated successfully, but these errors were encountered:

opk12 · 2022-08-31T12:01:25Z

Local patch

On my test machine, I replaced line 2053

    print(complete_msg, flush=True)

with

    print(json.dumps(event.source))

This prints one line for each event. My script reads each line and parses with json.loads().

Does event.source always esist in all event types received from the callback function? I'm asking because the change works on my machine for my limited, interactive usage, but I'd like to be sure for automated usage and for sharing the patch with other people.

Note on newlines: json.dumps() defaults to indent=None which selects the most compact representation (in reference to newlines) according to the docs

opk12 · 2022-08-31T12:27:36Z

To parse the lines from matrix-commander I am using:

def get_latest_events(*, room_id: str, tail: int=100) -> list[dict]:
    """
    Requires this patch to matrix-commander::  
        https://github.com/8go/matrix-commander/issues/92#issuecomment-1232843897

    :return: Events from the latest to the oldest.
    """

    import collections, json, subprocess

    args = ['matrix-commander', '--tail', str(tail), '--listen-self', '--room', room_id]
    mc_stdout = subprocess.run(args, capture_output=True).stdout
    mc_lines = mc_stdout.decode('utf-8').splitlines()
    return [json.loads(line) for line in mc_lines]

HoboDev · 2022-09-16T07:47:14Z

I've made some very basic implementation which uses the flag --output-json
It's not yet incomplete but I was driven by the same idea wanting to have JSON output.

Maybe with a bit of work this can be turned into a PR

https://github.com/HoboDev/matrix-commander/tree/feat-output-json

opk12 · 2022-09-17T07:07:06Z

@HoboDev That's unrelated. This issue is for the data structure returned verbatim from the server. A Matrix room is a list of json objects that follow the Client-Server API specification and matrix-commander has the json inside event.source.

A client should use the Client-Server specification to parse matrix-commander output into anything.

Instead your code parses the standard json into a non-standard json with limited information. This is a maintainance nightmare in the long term, because it ties the downloader with the parser, the Client-Server spec is changing, and anyone will request their fields of interest.

@HoboDev Could you write a library to parse the Client-Server json (= wanted matrix-commander output) into a high-level data structure (list[Message] where each Message has text_revisions, reactions/annotations, in reply to, ...)? I don't know one for Python.

HoboDev · 2022-09-17T10:54:57Z

I've replaced the very same line you have changed to get json output and added some more code.
Looking at the keywords "json output" I fail to see how this isn't related. Aren't we talking about json output as the common denominator?

What I did is more like a rough draft or proof of concept since I didn't know which specification to follow and didn't have time for going into details. I don't recall that the code I've added parses anything that must be the existing code. I'm just using some variables to get the info and put it into a structure.
Of course this can always be changed to a standard. Which obviously is more sensible.

Unfortunately I've seen your issue after I was done with my draft. Would have been easier to start like this.

I guess I could write a library but neither am I well versed in python nor do I have the enough free time for digging into this.
Maybe as I tag along and more knowledge builds up the path will shorten.
For this endeavor we'd need the desired output from matrix-commander first anyway. Also it would make things a lot easier.
But currently I just need the json output and am looking for a sensible solution to implement that.

So what is your recommendation? Just outputting all json data as it arrives?
That shouldn't take too long and would be quite an easy pull request if it's accompanied by the right flag and documentation.
What about matrix-commanders info messages? I've rerouted the first line to stdErr. The same could be done for all info messages (if there are any, e.g. --debug messages).

HoboDev · 2022-09-17T11:00:46Z

Another thing just came through my mind (because I don't really know the spec and didn't have time to look at the resulting data):

Wouldn't it be more sensible to output the whole event?

Let's talk flags...
E.g.

--print-raw-event
--print-raw-event-source

Then we could have both. The spec would remain unchanged and the next program could sort it out.

8go · 2022-10-02T14:27:37Z

Hi @opk12 and @HoboDev

Thank you for your valuable input.

You make valid points. I also see that there are several issues, all somewhat related.

Let me think a bit about it, maybe the best is an incremental approach. Get the JSON event data published for listening, thereafter step-by-step add other event/JSON related output issues.

- bug fix in --joined-members - new feature: --output - --output is currently only implemented for 2 functions: --joined-members and --joined-rooms - see Issue #94 - see Issue #95 - see Issue #92 - see Issue #89

8go · 2022-10-04T19:20:06Z

I want to centralize all JSON-related, event-data, raw-data, output-format etc. discussion in one place, here in Issue #92.

Hence I closed Issue #95.

8go · 2022-10-04T19:21:54Z

@HoboDev , @opk12

Please check out the new release and the --output feature, currently implemented for --joined-members and --joined-rooms.

Test it and give feedback.

Is that what you expected?
Can it be improved?

8go · 2022-10-04T19:28:26Z

@HoboDev wrote this in Issue #94

I have started a basic implementation of --output-json to get JSON on stdout, so I don't have to deconstruct the message lines. You can check it out here: https://github.com/HoboDev/matrix-commander/tree/feat-output-json

It is lacking a bit on documentation and maybe does not adhere too much to existing variable names. But at least it's a start. So far I've covered basic messages and media files.

I moved it over here because I want all the discussion to be in the same place, here in Issue #92.

8go · 2022-10-04T19:35:55Z

A question I have: --joined-members "*" in text mode produces multiple lines as output, one line for each room.

In JSON, should that be

1 line with 1 JSON array (with N objects in it) or
N lines each one with 1 JSON object?

If you run m-c now with --joined-members "*" you will see that it returns 1 line with 1 JSON array (with N objects).

Is that the best?

Same question will come up once received messages are done. Should it output 1 line with 1 JSON object for each message read, or just a total of 1 line with a JSON array of all N messages?

What do you think? I was leaning towards just returning 1 single array object. The user can then use jq or something else to go thru the array.

opk12 · 2022-10-04T20:45:29Z

I'd follow the standard Unix practice of printing one object on one line.

For --joined-members, the object of interest is a member, and there is no pro in emitting JSON instead of @user:homeserver, unless you plan to add fields in the future and document that the returned JSON may have fields added. Edit: then a separate call can get the user info from the ID.
For messages, the object of interest is the event as returned from the homeserver.

The purpose of the Unix way is that a client can read an object as soon as it's downloaded. Otherwise

a client that only needs some objects, or decides to terminate sooner (error etc), must wait to download everything, and fails for a network failure while downloading unnecessary events
it is inefficient to parse a very huge string (many MB) then navigate through a very huge array. It can also be painful when the system is low on memory or on a low end device (think Mobian).
The standard Unix tools can't be applied, you must be a programmer (not just a power user) and learn jq, and not all systems have jq available. For example, with the patch above that prints one json per line, I am filtering the jsons with grep and head (no parsing) to get all the mxc:// URLs for a specific room.

The Unix way also has the nice property of SIGPIPE: matrix-commander | client is optimized by the OS, by terminating matrix-commander as soon as client terminates. So the download is stopped automatically as soon as client gets the necessary material. This requires that messages are sent to client as soon as they are available.

8go · 2022-10-04T20:51:38Z

@opk12 thanks for the input and feedback.

Anyone else, any more feedback?

8go · 2022-10-04T20:58:49Z

Option 1 , currently implemented, output is 1 line with 1 JSON array (with N objects in it)

matrix-commander --joined-members "\!rrr:xxx.sytes.net" "\!sss:xxx.xxx.xxx"  --output rAW | jq 
[
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!rrr:xxx.xxx.xxx"
 },
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     },
     {
       "user_id": "@u2:xxx.xxx.xxx",
       "display_name": "u2",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!sss:xxx.xxx.xxx"
 }
]

Option 2 , currently not implemented, output is N lines each one with 1 JSON object; preferred by @opk12 as explained above

matrix-commander --joined-members "\!rrr:xxx.sytes.net" "\!sss:xxx.xxx.xxx"  --output rAW | jq 
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!rrr:xxx.xxx.xxx"
 }
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     },
     {
       "user_id": "@u2:xxx.xxx.xxx",
       "display_name": "u2",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!sss:xxx.xxx.xxx"
 }

- new action workflow for Docker - changed `--output raw` to print 1 line PER data-item, i.e. N lines with 1 JSON object each (instead of 1 line in TOTAL with a JSON array on N JSON objects) - added code to cover --output for actions --listen and --tail - see Issue #92

8go · 2022-10-05T12:07:17Z

I changed the code to follow @opk12 advice. Now the Option 2 is implemented.

Now a separate line and a separate JSON object is printed for each data-set (e.g. message received, room-member found, etc.)

Test it with --listen, --tail or --joined-members and --joined-rooms.

E.g. the new code delivers this:

matrix-commander --joined-members "\!rrr:xxx.sytes.net" "\!sss:xxx.xxx.xxx"  --output raw | jq 
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!rrr:xxx.xxx.xxx"
 }
 {
   "members": [
     {
       "user_id": "@u1:xxx.xxx.xxx",
       "display_name": "u1",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     },
     {
       "user_id": "@u2:xxx.xxx.xxx",
       "display_name": "u2",
       "avatar_url": "mxc://xxx.xxx.xxx/zzz"
     }
   ],
   "room_id": "!sss:xxx.xxx.xxx"
 }

- upgrade code from matrix-nio v0.19 to 0.20 - added --output to more functions - renamed --output options from `human` to `text` - renamed --output options from `raw` to `json` - renamed --output options from `raw-details` to `json-max` - might be useful for Issue #89 - see Issue #92

8go · 2022-10-06T12:18:19Z

Have a look at the new release. --output now covers --listen and --tail.

Try something like --tail 1 --listen-self --output json-max | jq or --tail 1 --listen-self --output json | jq

@opk12
Your 20-line Python program (posted above) to read messages should work. I have not tested it, but please report back.

longregen · 2022-10-06T15:51:20Z

Wow, I was literally looking for how to do this and saw your release 4 hours ago. Thank you!

opk12 · 2022-10-07T08:25:53Z

@8go My use case is specifically about having spec-defined events:

when the spec changes, I should not wait for my program to break, report on GitHub, wait for m-c to implement, test the m-c update (pray without bugs), re-deploy in production
programmers should not try the option to learn what is output, or guess about what 'close to matrix-nio data' can mean
programmers need the guarantee that a future implementation does not change the output format

In other words, my use case is specifically about printing the source event unaltered, as returned verbatim from the Matrix homeserver without any kind of manipulation. It's not about collecting the fields and making a new dict, it's about treating the matrix-nio event object as opaque and passing it on as-is to the client.

opk12 · 2022-10-07T08:30:51Z

Another point is guaranteed interoperability with other libraries in the Matrix ecosystem, i.e.

it's not about having the fields in the output, or my personal trust in the m-c implementation quality, it's being able to prove the correctness of my client code to the future myself / dev team lead / ... by quoting the m-c manpage that says that the output follows the spec
if I copy a function from another FLOSS library into my software, it will manipulate spec events, and passing m-c output to it should work de jure (not as a de facto "coincidence")

To be clear, experience shows that, if one needs to touch the Matrix events because an easy interface is not sufficient, learning the spec is the best favor they can do to themselves in the long term.

8go · 2022-10-07T15:40:40Z

OK, @opk12 , your point is valid. But ...

Let's take an example: --get-room-info . --get-room-info
a) gets the room object from matrix-nio and
b) adds display_name obtained via matrix-nio API call display_name and
c) outputs the combined Json object (which includes the display_name)

Observation:
a) Does the matrix-nio give me a room object the follows the official Matrix spec? Hopefully? But I do not know for sure, I have not compared the matrix-nio code with the spec.
b) display_name is not in the matrix-nio object. There is name. There is group_name, etc. Calling the display_name method in the nio API, it computes the display_name based on name, groupname, etc. It seems convenient to have the display_name readily available without having to making another m-c call.

In summary,

there are a few instances where one or two fields are added (like display_name in the example above).
nothing is ever removed, except transport_response for json because transport_response is not not the data, it is about how the data was obtained and transported.

How would you improve that?

- added new option `--get-client-info` - `--output` has been added to all functions, all functions that have output can do the output in JSON now - see Issue #92

8go · 2022-10-07T15:51:35Z

The description now reads:

  --output OUTPUT       This option decides on how the output is presented. Currently offered choices are: 'text', 'json' and 'json-max'.
                        Provide one of these choices. The default is 'text'. If you want to use the default, then there is no need to use
                        this option. If you have chosen 'text', the output will be formatted with the intention to be consumed by humans,
                        i.e. readable text. If you have chosen 'json', the output will be formatted as JSON. The content of the JSON
                        object matches the data provided by the matrix-nio API. In some occassions the output is enhanced by having added
                        a few data items for convenience. These convenient data items are added to the data from matrix-nio. In most cases
                        the output will be processed by other programs rather than read by humans. Option 'json-max' is practically the
                        same as 'json', but yet another additional field has been added. The data item 'transport_response' which
                        gives information on how the data was obtained and transported is being added.

opk12 · 2022-10-07T20:16:45Z

Props for the big work and effort. I am following the releases and I see the pervasive code changes.

I personally need a room's events (--listen and --tail) for piping into a spec-compliant parser. This was lost in the thread above, sorry for not clarifying that I referred to the room events today.

This is just a 2-line change, Callbacks.message_callback() doing if gs.pa.output == '--speced-raw-event-from-homeserver': print(json.dumps(original event)) (without obj_to_dict and custom fields, to spare me a compatibility layer around m-c that depends on m-c implementation details).

8go · 2022-10-08T12:53:32Z

How about a --output json-spec that is equal to json (for the time being) except for the message callback.

?

8go · 2022-10-08T12:55:32Z

I kind of foresaw that some day something like this would happen (I didnt think it would happen so soon, hahaha), that is why it is not a binary on-off --json but a --output many-sub-options.

- implemented `--output json-spec` to sprint messages etc as spec, as provided by matrix-nio, without modifications - see Issue #92

8go · 2022-10-08T15:17:50Z

@opk12 try out --output json-spec

opk12 · 2022-10-08T20:20:36Z

Thank you! Here are some points.

Is there a reason now for json-max and json to be separate?
Shouldn't json-spec be defined in the help text in terms of the C-S spec? Non-programmer power users, or programmers who know Matrix but not Python, won't even try it if they read "matrix-nio".
json-spec does not really make sense for --get-room-info and other queries that are not for room events. It feels weird to write -spec and get stuff from matrix-nio (one of many libraries).

About -spec: Reading the commit, it feels like having an umbrella "matrix-nio stuff" option for conceptually different things (room events and matrix-nio objects) could be caused by my other feature request, of calling the m-c functions to avoid reimplementing the same snippets around matrix-nio. But it's really as simple as that (= function call) because serialization (= json.dump of matrix-nio objects) is contrary to the feature:

deserializing requires me more effort than calling or copy/pasting a snippet from matrix-commander
while the output is understandable by a human, it is unusable programmatically, because __dict__ shows the implementation-dependent private fields (so my app cannot rely on it) and because __dict__ won't output properties (= fields that are not stored in the object, but computed on every call)
proper de/serialization needs dedicated methods in matrix-nio (merely exchanging fields won't make it)

Maybe there is some use case that I am missing, but why printing matrix-nio objects at all in the -spec case.

Some find it convenient to shorten if a == A1 or a == A2 to the idiomatic if a in (A1, A2, A3)

8go · 2022-10-10T16:44:39Z

@opk12 Regarding:

Is there a reason now for json-max and json to be separate?

Only reason is that json-max also gives transport-response but fwe people will care about that

Shouldn't json-spec be defined in the help text in terms of the C-S spec? Non-programmer power users, or programmers who know Matrix but not Python, won't even try it if they read "matrix-nio".

Yes, It would make sense to define/document json-spec as "according to specification....".

json-spec does not really make sense for --get-room-info and other queries that are not for room events. It feels weird to write -spec and get stuff from matrix-nio (one of many libraries).

Are you suggesting that json-spec should only print the message events in --listen and --tail but be silent (print nothing) in all other calls (e.g. --get-room-info) ? The message event when receiving messages is the only "as-spec" event, correct? Or is there another one ?

4> Some find it convenient to shorten if a == A1 or a == A2 to the idiomatic if a in (A1, A2, A3)

Thanks for pointing that out, easy to fix.

Question: --tail 10 -y --output json-spec : This is working for you? Yes/No? Working as you expected?

- see Issue #92

opk12 · 2022-10-11T15:23:33Z

json-spec does not really make sense for --get-room-info and other queries that are not for room events. It feels weird to write -spec and get stuff from matrix-nio (one of many libraries).

Are you suggesting that json-spec should only print the message events in --listen and --tail but be silent (print nothing) in all other calls (e.g. --get-room-info) ?

I suggest that it raises an error: another option was meant (it is a typo), or, to silence output, the usual shell redirection > /dev/null would be more appropriate.

The message event when receiving messages is the only "as-spec" event, correct? Or is there another one ?

As far as I know (I studied room events), there are no other kinds of events (the spec says "event" and "room event" interchangeably) or other cases for --json-spec. Taking your example of --get-room-info: it merges information from different HTTP calls and I don't know a single Matrix object that represents a room. If the spec adds such an object in the future, it will be the perfect match for --json-spec, so giving an error now gives flexibility in the future (no backward compatibility to constrain the output).

Question: --tail 10 -y --output json-spec : This is working for you? Yes/No? Working as you expected?

Sure, it works and I migrated my room dump script!

Another code simplification is possible. Rather than making the output inside an if, all outputs can be made in advance. Adding other --outputs in the future will not make a lot of unmanageable nested ifs, but just one new line another_output = json_ | ...

For example, this

                if gs.pa.output in ...:
                    dic = resp.__dict__
                    if gs.pa.output == OUTPUT_JSON:
                        ...

can become (json_ has the underscore to avoid a name clash with the module json)

    json_ = {...}
    json_max = json_ | {'new key': ...}
    json_other = json_.copy()
    json_other.pop('key to remove')
    json_spec = ...
    print_event_depending_on_option(gs.pa.output, text=..., json_=json_, json_max=json_max, json_spec=json_spec)

    def print_event_depending_on_option(option: Literal['text', 'json', 'json-max', 'json-spec'], *, text: str, json_: dict=None, json_max: dict=None, json_spec: dict=None) -> None:
        results = {
            OUTPUT_TEXT: text,
            OUTPUT_JSON: json_,
            OUTPUT_JSON_MAX: json_max,
            OUTPUT_JSON_SPEC: json_spec,
        }
        print(results[gs.pa.output])

- internal changes - see Issue #92

8go changed the title ~~print the source event~~ Enhancement: print the source event, output data or event in JSON instead of plain text format Oct 2, 2022

8go mentioned this issue Oct 2, 2022

Question: How would I go about saving the output from --joined-rooms and/or --joined-members to json format? #95

Closed

8go added the enhancement New feature or request label Oct 4, 2022

8go mentioned this issue Oct 4, 2022

Enhancement: flag to enable JSON output to stdout (basic work started) #94

Closed

8go pushed a commit that referenced this issue Oct 7, 2022

added new option --get-client-info

a9c2d48

- added new option `--get-client-info` - `--output` has been added to all functions, all functions that have output can do the output in JSON now - see Issue #92

8go pushed a commit that referenced this issue Oct 8, 2022

added new type json-spec to --output

a2a3364

- implemented `--output json-spec` to sprint messages etc as spec, as provided by matrix-nio, without modifications - see Issue #92

8go pushed a commit that referenced this issue Oct 10, 2022

minor internal improvements, making if-s more pythonic

6eb5f8e

- see Issue #92

8go pushed a commit that referenced this issue Oct 11, 2022

changes to --output, json-spec only provides output for --listen

5b94b04

- internal changes - see Issue #92

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancement: print the source event, output data or event in JSON instead of plain text format #92

Enhancement: print the source event, output data or event in JSON instead of plain text format #92

opk12 commented Aug 31, 2022

opk12 commented Aug 31, 2022

opk12 commented Aug 31, 2022 •

edited

Loading

HoboDev commented Sep 16, 2022 •

edited

Loading

opk12 commented Sep 17, 2022 •

edited

Loading

HoboDev commented Sep 17, 2022 •

edited

Loading

HoboDev commented Sep 17, 2022 •

edited

Loading

8go commented Oct 2, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022 •

edited

Loading

opk12 commented Oct 4, 2022 •

edited

Loading

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 5, 2022

8go commented Oct 6, 2022

longregen commented Oct 6, 2022

opk12 commented Oct 7, 2022 •

edited

Loading

opk12 commented Oct 7, 2022 •

edited

Loading

8go commented Oct 7, 2022

8go commented Oct 7, 2022

opk12 commented Oct 7, 2022 •

edited

Loading

8go commented Oct 8, 2022

8go commented Oct 8, 2022

8go commented Oct 8, 2022

opk12 commented Oct 8, 2022 •

edited

Loading

8go commented Oct 10, 2022 •

edited

Loading

opk12 commented Oct 11, 2022

Enhancement: print the source event, output data or event in JSON instead of plain text format #92

Enhancement: print the source event, output data or event in JSON instead of plain text format #92

Comments

opk12 commented Aug 31, 2022

opk12 commented Aug 31, 2022

Local patch

opk12 commented Aug 31, 2022 • edited Loading

HoboDev commented Sep 16, 2022 • edited Loading

opk12 commented Sep 17, 2022 • edited Loading

HoboDev commented Sep 17, 2022 • edited Loading

HoboDev commented Sep 17, 2022 • edited Loading

8go commented Oct 2, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 4, 2022 • edited Loading

opk12 commented Oct 4, 2022 • edited Loading

8go commented Oct 4, 2022

8go commented Oct 4, 2022

8go commented Oct 5, 2022

8go commented Oct 6, 2022

longregen commented Oct 6, 2022

opk12 commented Oct 7, 2022 • edited Loading

opk12 commented Oct 7, 2022 • edited Loading

8go commented Oct 7, 2022

8go commented Oct 7, 2022

opk12 commented Oct 7, 2022 • edited Loading

8go commented Oct 8, 2022

8go commented Oct 8, 2022

8go commented Oct 8, 2022

opk12 commented Oct 8, 2022 • edited Loading

8go commented Oct 10, 2022 • edited Loading

opk12 commented Oct 11, 2022

opk12 commented Aug 31, 2022 •

edited

Loading

HoboDev commented Sep 16, 2022 •

edited

Loading

opk12 commented Sep 17, 2022 •

edited

Loading

HoboDev commented Sep 17, 2022 •

edited

Loading

HoboDev commented Sep 17, 2022 •

edited

Loading

8go commented Oct 4, 2022 •

edited

Loading

opk12 commented Oct 4, 2022 •

edited

Loading

opk12 commented Oct 7, 2022 •

edited

Loading

opk12 commented Oct 7, 2022 •

edited

Loading

opk12 commented Oct 7, 2022 •

edited

Loading

opk12 commented Oct 8, 2022 •

edited

Loading

8go commented Oct 10, 2022 •

edited

Loading