App gets movie details from internet #8

Closed
caloni opened this issue Feb 5, 2023 · 7 comments
Comments


caloni commented Feb 5, 2023

Situation

We need to type every detail of the movie when registering a new DVD.

Objective

  • Use default values for common cases (e.g. 1 disc, DVD format);
  • When typing the product name, show a list of movie matches from the internet; if one is chosen, fill in the movie title and director fields (probably Audio as well).
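
The first bullet could be sketched roughly as follows. This is only an illustration: `DvdRecord`, `new_record` and the field names are hypothetical and not taken from the project; only the "1 disc" and "DVD format" defaults come from the issue text.

```python
from dataclasses import asdict, dataclass


@dataclass
class DvdRecord:
    """Hypothetical registration form; field names are assumptions."""

    # Expected from the user or from the movie lookup.
    title: str
    director: str = ""
    # Defaults for the common case named in the issue: 1 disc, DVD format.
    discs: int = 1
    media_format: str = "DVD"


def new_record(**typed_fields):
    """Build a record from what was typed; defaults fill in the rest."""
    return asdict(DvdRecord(**typed_fields))


record = new_record(title="Maria Full of Grace")
override = new_record(title="Maria Full of Grace", discs=2)
print(record)
```

Anything the user types explicitly (as in `override`) still wins over the default.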

Orientation

We can use IMDb and its REST API, or Google. The advantage of Google is that it is possible to type the Brazilian title of the movie and still find it (the Brazilian title is not always available in IMDb). Other options include OMDb. We can take advantage of a Python library and expose it through an endpoint inside the API (look at this script).

Proof

Typing a word in the product title field should bring up a floating list with at least one movie; when a movie is selected, at least the movie title and movie director fields should be filled in automatically.

@caloni caloni added the enhancement New feature or request label Feb 6, 2023
@caloni caloni self-assigned this Feb 6, 2023

caloni commented Feb 7, 2023

Movie search

Added support for movie search using IMDbPY (now Cinemagoer) through a Python endpoint, but there are two catches:

  • It is painfully slow to fetch several results (each movie needs a request to its own page);
  • The pt-BR title is hard to get.

For the first issue, I am thinking of returning only the title and year in the search, and fetching the full information only for the selected movie.

For the second issue, I am thinking of reading the library source code to understand what I am doing wrong when searching for the akas titles.
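
The idea for the first issue could look something like the sketch below. It assumes a Cinemagoer-style client whose `search_movie(title)` returns movie objects exposing a `movieID` attribute and dictionary-style `get`; `search_titles`, the returned keys, and the `FakeMovie`/`FakeClient` stand-ins (including the placeholder id) are hypothetical and exist only so the example runs without network access.

```python
def search_titles(client, query, limit=10):
    """Cheap search: one request, keeping only id, title and year per match."""
    matches = client.search_movie(query)
    return [
        {"id": m.movieID, "title": m.get("title"), "year": m.get("year")}
        for m in matches[:limit]
    ]


class FakeMovie(dict):
    """Stand-in for a library movie object (demo only)."""

    def __init__(self, movie_id, **fields):
        super().__init__(**fields)
        self.movieID = movie_id


class FakeClient:
    """Stand-in for the real client; the id below is a placeholder."""

    def search_movie(self, query):
        return [FakeMovie("0000001", title="Maria Full of Grace", year=2004)]


suggestions = search_titles(FakeClient(), "maria")
print(suggestions)
```

With the real library installed, the client would presumably be created with `from imdb import Cinemagoer` and `Cinemagoer()` in place of `FakeClient()`.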

@caloni caloni linked a pull request Feb 7, 2023 that will close this issue
@caloni caloni removed a link to a pull request Feb 7, 2023

caloni commented Feb 7, 2023

Painfully slow

It is painfully slow to fetch several results (each movie needs a request to its own page).

Solved. Now the simple search fills in only the titles. The drawback is that the director's name is missing, but typing that is much less of a problem than filling in the other zillion fields.

The next step is to fill the other fields with default values =)


caloni commented Feb 8, 2023

Default values

✔️

The next and final step is to change the title to Brazilian Portuguese. I am not sure yet whether it is a priority, or even desirable. To be decided.


caloni commented Feb 8, 2023

Director name

The director name is obtained using the same async search, but with the movie id instead of a title. This way, after one of the available titles is selected, a second search by id fetches more details about the selected title.

The next step is to see how to get the pt-BR title.
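
A minimal sketch of the lookup-by-id step, assuming a Cinemagoer-style client whose `get_movie(movie_id)` returns an object with dictionary-style `get` and a `director` list of person objects that also support `get("name")`. `get_details`, the returned keys, the stand-in client and its placeholder id are hypothetical names used for illustration; the key holding the directors may differ between library versions.

```python
def get_details(client, movie_id):
    """Fetch one full record by id and keep only what the form needs."""
    movie = client.get_movie(movie_id)
    # Assumption: directors are listed under the "director" key.
    directors = [person.get("name") for person in movie.get("director") or []]
    return {"id": movie_id, "title": movie.get("title"), "directors": directors}


class FakeClient:
    """Stand-in for the real client so the example runs offline."""

    def get_movie(self, movie_id):
        return {
            "title": "Maria Full of Grace",
            "director": [{"name": "Joshua Marston"}],
        }


class EmptyClient:
    """Stand-in returning a record without director information."""

    def get_movie(self, movie_id):
        return {"title": "Untitled"}


details = get_details(FakeClient(), "0000001")
no_director = get_details(EmptyClient(), "0000002")
print(details)
```

Only the selected title pays the cost of the extra request, which keeps the suggestion list fast.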


caloni commented Feb 14, 2023

PTBR

I am stuck trying to get the Brazilian Portuguese title from the 'akas' property of a movie. While analyzing why Cinemagoer is not working, I stumbled upon this issue saying that the latest version can have problems because of a big update to the IMDb website.

I tried to run the tests of the latest version, and they failed the same way as when I tried to import the helpers module:

platform win32 -- Python 3.9.11, pytest-7.2.1, pluggy-1.0.0
rootdir: C:\Users\caloni\projects\cinemagoer
plugins: anyio-3.6.2
collected 375 items / 1 error

===================================================================================== ERRORS ======================================================================================
______________________________________________________________________ ERROR collecting tests/test_locale.py ______________________________________________________________________
tests\test_locale.py:6: in <module>
    import imdb.locale
..\..\AppData\Local\Programs\Python\Python39\lib\site-packages\imdb\locale\__init__.py:28: in <module>
    translation = gettext.translation('imdbpy', LOCALE_DIR)
..\..\AppData\Local\Programs\Python\Python39\lib\gettext.py:592: in translation
    raise FileNotFoundError(ENOENT,
E   FileNotFoundError: [Errno 2] No translation file found for domain: 'imdbpy'
============================================================================= short test summary info =============================================================================
ERROR tests/test_locale.py - FileNotFoundError: [Errno 2] No translation file found for domain: 'imdbpy'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================================================ 1 error in 0.75s =================================================================================

My next step is to study locales in Python, because this seems to happen only on my machine.
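
For context on the traceback: when no language list is passed, Python's `gettext.translation` picks candidate languages from the `LANGUAGE`, `LC_ALL`, `LC_MESSAGES` and `LANG` environment variables, and raises `FileNotFoundError` when no matching catalog file is found unless `fallback=True` is given. The sketch below reproduces that behavior with an empty directory; `load_translation` is a hypothetical helper, not something Cinemagoer provides.

```python
import gettext
import tempfile


def load_translation(domain, locale_dir):
    """Return a translation object, falling back to untranslated strings."""
    # fallback=True makes gettext return NullTranslations instead of raising
    # FileNotFoundError when no catalog matches the current language.
    return gettext.translation(domain, locale_dir, fallback=True)


with tempfile.TemporaryDirectory() as empty_dir:
    try:
        gettext.translation("imdbpy", empty_dir)
        strict_failed = False
    except FileNotFoundError:
        # Same error class as in the pytest output above.
        strict_failed = True
    translation = load_translation("imdbpy", empty_dir)

print(strict_failed)
print(translation.gettext("director"))
```

With the fallback, strings simply come back untranslated, which may be acceptable when only the import needs to succeed.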


caloni commented Feb 18, 2023

Import helpers solved

What I thought was the problem was solved by setting a locale-related environment variable that doesn't exist on Windows.

image

With that the import worked as well, but the akas list is still the same.

image

I really don't know whether I am doing something wrong, whether this is a bug caused by the new version of the IMDb site, or whether it is a bug in the lib that has to be fixed before we can use the full akas list. Anyway, I'm kind of tired of this approach. Perhaps a better strategy is to make a dumb search elsewhere (Google, DuckDuckGo) and see what happens.


caloni commented Feb 19, 2023

Won't do

This issue has dragged on around the pt-BR title with no easy result. In the meantime, some research was done on text recognition in the DVD cover image, with promising results:

google_vision_image_annotate.json

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "https://raw.githubusercontent.com/Caloni/cadastro-dvds-venda/issue-8-movie-search/dvd.jpg"
        }
      },
      "features": [
        {
          "type": "TEXT_DETECTION"
        }
      ]
    }
  ]
}
curl -H "Accept: application/json" -H "Content-Type: application/json" -d "@google_vision_image_annotate.json" "https://vision.googleapis.com/v1/images:annotate?key=<YOUR_API_KEY>" > result.txt
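
If the call moves into the Python endpoint, the same request could be built with the standard library, roughly as below. The payload and URL mirror the JSON file and curl command above; `build_request` and `VISION_URL` are hypothetical names, and the key is a placeholder.

```python
import json
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_uri, api_key):
    """Build the same POST request that the curl command sends."""
    payload = {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
    return urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


request = build_request(
    "https://raw.githubusercontent.com/Caloni/cadastro-dvds-venda/issue-8-movie-search/dvd.jpg",
    "YOUR_API_KEY",
)
# Sending it needs a real key and network access:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```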

result.txt (partial)

{
  "responses": [
    {
      "textAnnotations": [
        {
          "locale": "pt",
          "description": "All\nINDICADO AO OSCAR 2005- MELHOR ATRIZ\nCATALINA SANDINO MORENO\nDVD\nVIDEO\nExcelente!\nREV VEJA\nSurpreendente e autêntico.\nTHE NEW YORK TIMES\nMARIA\nCHEIA DE\nGRAÇA\nMARIA FULL OF GRACE\nBaseado em milhares de histórias reais.\nUrso de Prata de\nMELHOR ATRIZ\nBERLIN INTERNATIONAL\nBERL\nFILM FESTIVAL\nPrémio do Júri de\nMELHOR FILME\nSUNDANCE FILM FESTIVAL\nVENCEDOR\nMELHOR FILME\n28 MOSTRA SÃO PAULO DE CINEMA\nPrémio do Júri\nImagem\nFilme",
          "boundingPoly": {
            "vertices": [
              {
                "x": 30,
                ...
          }
        }
      ],
      "fullTextAnnotation": {
        "pages": [
          {
            "property": {
              "detectedLanguages": [
                {
                  "languageCode": "pt",
                  "confidence": 0.664754
                },
                {
                  "languageCode": "en",
                  "confidence": 0.22185819
                },
                {
                  "languageCode": "es",
                  "confidence": 0.06023947
                }
              ]
            },
            "width": 720,
            "height": 1280,
            "blocks": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": 537,
                      "y": 23
            ]
            ...
          }
        ],
        "text": "All\nINDICADO AO OSCAR 2005- MELHOR ATRIZ\nCATALINA SANDINO MORENO\nDVD\nVIDEO\nExcelente!\nREV VEJA\nSurpreendente e autêntico.\nTHE NEW YORK TIMES\nMARIA\nCHEIA DE\nGRAÇA\nMARIA FULL OF GRACE\nBaseado em milhares de histórias reais.\nUrso de Prata de\nMELHOR ATRIZ\nBERLIN INTERNATIONAL\nBERL\nFILM FESTIVAL\nPrémio do Júri de\nMELHOR FILME\nSUNDANCE FILM FESTIVAL\nVENCEDOR\nMELHOR FILME\n28 MOSTRA SÃO PAULO DE CINEMA\nPrémio do Júri\nImagem\nFilme"
      }
    }
  ]
}

Given the results above, a more promising direction for this project is to invest some time in this API and leave the original efforts with the IMDb API as an older, incomplete version.
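
As a starting point, the detected text can be pulled out of a response shaped like result.txt with a few lines of Python. This is a sketch under the assumption that the first entry of `textAnnotations` carries the whole detected text, as in the partial result above; `extract_text_lines` is a hypothetical helper and the sample is a shortened copy of that result.

```python
def extract_text_lines(result):
    """Return the detected text as a list of non-empty lines."""
    responses = result.get("responses") or [{}]
    annotations = responses[0].get("textAnnotations") or []
    if not annotations:
        return []
    # Assumption: the first annotation holds the full text of the image.
    full_text = annotations[0].get("description", "")
    return [line.strip() for line in full_text.split("\n") if line.strip()]


sample = {
    "responses": [
        {
            "textAnnotations": [
                {
                    "locale": "pt",
                    "description": "MARIA\nCHEIA DE\nGRAÇA\nMARIA FULL OF GRACE",
                }
            ]
        }
    ]
}

lines = extract_text_lines(sample)
print(lines)
```

Deciding which of these lines is the movie title is a separate problem that this sketch does not attempt to solve.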

@caloni caloni closed this as not planned Feb 19, 2023