Incomplete title for tv show "the last of us" #739

Open
gargolito opened this issue Jan 16, 2023 · 4 comments

@gargolito

gargolito commented Jan 16, 2023

guessit The.Last.of.Us.S01E01
For: The.Last.of.Us.S01E01

GuessIt found: {
    "title": "The Last of",
    "country": "UNITED STATES",
    "season": 1,
    "episode": 1,
    "type": "episode"
}
@Toilal
Member

Toilal commented Feb 18, 2023

This issue has been discussed many times for other shows/movies whose titles end with the word "us".

I'll leave the issue open though, as I also want this to be fixed. Maybe we could consider US as a country only if it's uppercase, when it's not bounded by other matches.
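
A minimal sketch of the uppercase-only idea, to make the proposal concrete. This is not guessit code: `split_title_and_country` and the `SEASON_EPISODE` pattern are hypothetical names, the filename `Shameless.US.S01E01` is just an illustrative input, and the sketch ignores the "not bounded by other matches" condition entirely.

```python
import re

# Hypothetical marker pattern for this sketch only (not guessit's own).
SEASON_EPISODE = re.compile(r"^S\d{2}E\d{2}$", re.IGNORECASE)


def split_title_and_country(name):
    """Return (title, country) for a dot-separated release name."""
    tokens = name.split(".")
    # Title candidates are the tokens before the first SxxExx marker.
    end = next(
        (i for i, tok in enumerate(tokens) if SEASON_EPISODE.match(tok)),
        len(tokens),
    )
    head = tokens[:end]
    country = None
    # Assumed rule: a trailing "us" counts as a country only in uppercase.
    if len(head) > 1 and head[-1] == "US":
        country = "UNITED STATES"
        head = head[:-1]
    return " ".join(head), country


print(split_title_and_country("The.Last.of.Us.S01E01"))
print(split_title_and_country("Shameless.US.S01E01"))
```

Under this assumed rule, "Us" stays in the title while "US" is still read as a country.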

@gargolito
Author

I have another weird one. An episode of Ted Lasso had this in its filename, with season, episode and a number: S03E03.4.5.1, and guessit thought it was a multi-episode file. I can see why. I haven't looked at your code yet, but are you using any kind of NLP, like spaCy? That might help with calculating distance between words.

guessit Ted.Lasso.S03E03.4-5-1.mkv
For: Ted.Lasso.S03E03.4-5-1.mkv
GuessIt found: {
    "title": "Ted Lasso",
    "season": 3,
    "episode": [
        3,
        4,
        5
    ],
    "episode_title": "1",
    "screen_size": "1080p",
    "streaming_service": "AppleTV",
    "source": "Web",
    "audio_codec": "Dolby Digital Plus",
    "audio_channels": "5.1",
    "video_codec": "H.264",
    "release_group": "NTb",
    "container": "mkv",
    "mimetype": "video/x-matroska",
    "type": "episode"
}

@Toilal
Member

Toilal commented Apr 12, 2023

I haven't looked at your code yet but are you using any kind of NLP like spacy?

No, it's just a big bunch of regexes and rules to resolve conflicts between matches. I'm pretty sure some AI-based algorithm could do well at parsing movie/series filenames, but guessit is not based on any of those.
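
For readers unfamiliar with the "regexes plus conflict rules" approach, here is a toy sketch using only the standard library. It is not guessit's implementation: the `Match` type, the three patterns, and the "longer match wins" rule are all assumptions made up for illustration, and guessit's real rules evidently resolve this particular conflict differently.

```python
import re
from typing import NamedTuple


class Match(NamedTuple):
    name: str
    start: int
    end: int
    value: str


# Hypothetical patterns for this sketch only.
PATTERNS = {
    "episode": re.compile(r"S\d{2}E\d{2}", re.IGNORECASE),
    "country": re.compile(r"\bus\b", re.IGNORECASE),
    # Leading run of dot-separated alphabetic words.
    "title": re.compile(r"^[A-Za-z]+(?:\.[A-Za-z]+(?=\.|$))*"),
}


def find_matches(text):
    """Run every pattern and collect all raw, possibly overlapping matches."""
    return [
        Match(name, m.start(), m.end(), m.group())
        for name, rx in PATTERNS.items()
        for m in rx.finditer(text)
    ]


def resolve(matches):
    """Assumed conflict rule: drop a match fully inside a longer match."""
    kept = []
    for m in matches:
        covered = any(
            o is not m
            and o.start <= m.start
            and m.end <= o.end
            and (o.end - o.start) > (m.end - m.start)
            for o in matches
        )
        if not covered:
            kept.append(m)
    return sorted(kept, key=lambda m: m.start)


for match in resolve(find_matches("The.Last.of.Us.S01E01")):
    print(match.name, match.value)
```

Here the "country" match on "Us" lies inside the longer "title" match, so the toy rule discards it. Which rule wins in a real parser is a design decision, which is what this issue is about.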

@VeNoMouS

VeNoMouS commented Sep 12, 2023

This issue has been discussed many times for other shows/movies whose titles end with the word "us".

I'll leave the issue open though, as I also want this to be fixed. Maybe we could consider US as a country only if it's uppercase, when it's not bounded by other matches.

Hi again @Toilal, are you going to implement this? I'm in favor of it.
