Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shortcuts to tracks values #95

Closed
JulienPalard opened this issue Jul 2, 2020 · 9 comments
Closed

Shortcuts to tracks values #95

JulienPalard opened this issue Jul 2, 2020 · 9 comments

Comments

@JulienPalard
Copy link
Contributor

JulienPalard commented Jul 2, 2020

I'm working in a project that currently only handle jpg, so I always have only two tracks: General and Image, and at the moment I only need width and height, so I have to do something like:

info = MediaInfo.parse("file.jpg")
print(info.tracks[1].width, info.tracks[1].height)

And cross my fingers General and Image never get swapped in certain specific contexts.

Or I could do:

info = MediaInfo.parse("file.jpg")
image = [track for track in info.tracks if track.track_type == "Image"][0]
print(image.width, image.height)

Which is still a bit verbose.

What about:

info = MediaInfo.parse("file.jpg")
print(info.width, info.height)

It could be easily implemented by adding to the MediaInfo class:

def __getattr__(self, name):
    for track in self.tracks:
        value = getattr(track, name)
        if value is not None:
            return value
    raise AttributeError  # or return None

which would allow things like:

info = MediaInfo.parse(sys.argv[1])
print(f"{info.format} {info.width}×{info.height}")

to work with images and videos.

Alternatives

im_track = ImageMediaInfo.parse("file.jpg")  # Which raises if the result has no Image track
print(im_track.width, im_track.height)

or providing access to tracks by name (I know there can be multiple tracks with the same name):

im_track = MediaInfo.parse("file.jpg").image  # Which raises AttributeError if result has no Image track
print(im_track.width, im_track.height)

To discuss

Currently a Track returns None on non-existing attribute, should we mirror this here?

@sbraz
Copy link
Owner

sbraz commented Jul 2, 2020

What about:

I'm afraid we can't do that. There are a lot attributes that are present in multiple tracks.

or providing access to tracks by name (I know there can be multiple tracks with the same name):

I'm thinking of something like mediainfo.image_tracks, mediainfo.video_tracks, etc. It looks like all track types are defined here.

Would that work for you? If you're working with one image, you would only have to do mediainfo.image_tracks[0]

@JeromeMartinez
Copy link

There are a lot attributes that are present in multiple tracks.

If I understand well, it would take the first instance present (e.g. Width of video if there is both video and image).
Why not, but the API user must be clear about side effects when the input is not the expected one e.g. video instead of image.

I'm thinking of something like mediainfo.image_tracks, mediainfo.video_tracks, etc.

this is actually similar to the upstream API :-p, except that it is more dynamic (likemediainfo.tracks[image][0]), so it makes sense.

It looks like all track types are defined here.

It may be expanded in the future... But very rare, not expected at short term.

@sbraz
Copy link
Owner

sbraz commented Jul 2, 2020

Hi Jérôme, I'm always amazed at how quickly you appear in pymediainfo issues, do you get a notification whenever MediaInfoLib is mentioned 😛?

the API user must be clear about side effects when the input is not the expected one

I don't really like that, maybe if more users request that feature I will consider it but at the moment, I feel weird about exposing those unrelated attributes at the top level.

It may be expanded in the future... But very rare, not expected at short term.

OK, good to know, I guess this list hasn't changed for several years.

@sbraz
Copy link
Owner

sbraz commented Jul 2, 2020

this is actually similar to the upstream API :-p, except that it is more dynamic (likemediainfo.tracks[image][0]), so it makes sense.

I forgot to answer that: maybe I could also do it this way without breaking compatibility:

mediainfo.tracks[0] # index is an int, returns the first track
mediainfo.tracks["image"] # index is a str, returns Image tracks

@JulienPalard what do you think?

@JeromeMartinez
Copy link

JeromeMartinez commented Jul 2, 2020

I feel weird about exposing those unrelated attributes at the top level.

Same. I also got sch request (as well as not having an array for "general" part as it is always 1) but I currently prefer to have a coherent API rather than a simple one.

But as you (in pymediainfo) list all tracks in sequence, a mediainfo.image_tracks or mediainfo.tracks["image"] stuff makes sense.

I'm always amazed at how quickly you appear in pymediainfo issues, do you get a notification whenever MediaInfoLib is mentioned 😛?

"watch the project", "be notified of all conversations" GitHub feature 😛.

@sbraz
Copy link
Owner

sbraz commented Jul 2, 2020

But as you (in pymediainfo) list all tracks in sequence, a mediainfo.image_tracks of `mediainfo.tracks["image"] stuff makes sense.

Do you mean "instead of"?

"watch the project", "be notified of all conversations" GitHub feature stuck_out_tongue.

Ah cool, I wondered whether you were watching pymediainfo or just getting notifications for your own project.

@JeromeMartinez
Copy link

Do you mean "instead of"?

"or" (I edited my comment).

@JulienPalard
Copy link
Contributor Author

JulienPalard commented Jul 2, 2020

Thanks @JeromeMartinez for hopping in and @sbraz for the fast replies :)

It feel weired too, to me, to expose all values in the root scope and mix them (that's why I opened a discussion instead of a PR), and I had not searched enough to get the list of all attributes all tracks can have to estimate the number of collisions, and their importance.

You seem to agree that those collisions could be surprising to users (leading to unexpected values, potentially leading to bugs), to add this feature, I trust you.

About info.tracks["image"] vs info.image_tracks, I prefer the info.image_tracks, I'm not a fan of mixing tracks and list of tracks with the same access syntax.

this is actually similar to the upstream API :-p

Which is nice, for the (niche?) case of someone already knowing it and migrating to Python. (I was unable to find the doc though, can you link it to us?)

The info.image_tracks attribute would allow to write:

info = MediaInfo.parse("file.jpg")
print(info.image_tracks[0].width, info.image_tracks[0].height)

which is more reliable than the .tracks[1] one, and more concise than the list comprehension one, it's a win.

To help me see the difference, I'm trying to see what would change with the documentation example:

from pprint import pprint
from pymediainfo import MediaInfo
media_info = MediaInfo.parse("my_video_file.mp4")
for track in media_info.tracks:
    if track.track_type == "Video":
        print("Bit rate: {t.bit_rate}, Frame rate: {t.frame_rate}, "
              "Format: {t.format}".format(t=track)
        )
        print("Duration (raw value):", track.duration)
        print("Duration (other values:")
        pprint(track.other_duration)
    elif track.track_type == "Audio":
        print("Track data:")
        pprint(track.to_data())

could be rewritten as:

from pprint import pprint
from pymediainfo import MediaInfo
media_info = MediaInfo.parse("my_video_file.mp4")
for track in media_info.video_tracks:
    print("Bit rate: {t.bit_rate}, Frame rate: {t.frame_rate}, "
          "Format: {t.format}".format(t=track)
    )
    print("Duration (raw value):", track.duration)
    print("Duration (other values:")
    pprint(track.other_duration)
for track in media_info.audio_tracks:
    print("Track data:")
    pprint(track.to_data())

Are you happy with this? Looks good to me, and I like the side effect of dropping an indentation level.

Would asking for info.image_track, or even info.image to return info.image_tracks[0] be "a bit too much"? Maybe.

At least, forcing pymediainfo users to write [0] make it explicit that they are missing other potential tracks, so let's not go this way. Explicit is better.

I'm searching a bit on github to see if actual uses of pymediainfo (mine might not be representative) could benefit from it:

So yes, it looks like it would be usefull.

Seeing people using .tracks[0] to get the general track make me wonder: is this possible to have multiple General tracks? If not, would it make sense to give access to general attribute directly? I won't need it personally and seen a single example which would benefit from it (didn't searched a lot). If the general track is the only one to never have duplicates, it would be strange to provide .general_tracks.

Bonne journée.

@sbraz
Copy link
Owner

sbraz commented Nov 13, 2020

@JeromeMartinez any idea where I could find a very small sample that contains an Other type track? So far the best I found is https://trac.ffmpeg.org/attachment/ticket/746/mxfregression.mxf

I could also test from a saved XML file if that's not possible.

sbraz added a commit that referenced this issue Nov 13, 2020
In order to improve tests, chapters are added to sample.mkv and an image
is added to test files.
sbraz added a commit that referenced this issue Nov 13, 2020
In order to improve tests, chapters are added to sample.mkv and an image
is added to test files.
sbraz added a commit that referenced this issue Nov 13, 2020
In order to improve tests, chapters are added to sample.mkv and an image
is added to test files.
sbraz added a commit that referenced this issue Nov 18, 2020
Also add the following to improve tests:
* Chapters to sample.mkv.
* An image test file.
* A MediaInfo XML output for a file containing tracks of type "Other".
@sbraz sbraz closed this as completed in 12ad13e Nov 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants