New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the AniDB utility (together with a test!) #248
Changes from 1 commit
750286e
496fd2e
62fe4f3
7925842
68c6da5
18c376e
72a3ff9
379a541
1ffc793
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
from django.core.management.base import BaseCommand, CommandError | ||
from mangaki.utils.anidb import AniDB | ||
from mangaki.models import Work, Category | ||
from django.db.models import Count | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unused. |
||
from urllib.parse import urlparse, parse_qs | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can't we find an automated thing that does this job? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, theorically, autoflake should do the job. |
||
import sys | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Unused. |
||
|
||
def create_anime(**kwargs): | ||
anime = Category.objects.get(slug='anime') | ||
if 'anidb_aid' in kwargs: | ||
return Work.objects.update_or_create(category=anime, anidb_aid=kwargs['anidb_aid'], defaults=kwargs)[0] | ||
else: | ||
return Work.objects.create(category=anime, **kwargs) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if we don't specify an There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Destroying is not the correct word. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. Still, that does not answer the question, if |
||
|
||
class Command(BaseCommand): | ||
args = '' | ||
help = 'Retrieve AniDB data' | ||
|
||
def add_arguments(self, parser): | ||
parser.add_argument('id', type=int) | ||
|
||
def handle(self, *args, **options): | ||
if options.get('id'): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. C'est possible que ça soit false? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Pas tellement. Sécurité inutile. |
||
print(options.get('id')) | ||
anidb = AniDB('mangakihttp', 1) | ||
anime = create_anime(**anidb.get(options.get('id'))) | ||
print(anime) | ||
anime.retrieve_poster() |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -141,6 +141,7 @@ def retrieve_poster(self, url=None, session=None): | |
return False | ||
|
||
filename = os.path.basename(urlparse(url).path) | ||
# Hé mais ça va pas écraser des posters / créer des collisions, ça ? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Then, we need to handle name collisions.
If you encounter a collision, it means:
Generally, we should overwrite because we are lucky. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oui mais c'est l'objet d'une issue nice-to-have (vais créer). Et tu es pédant avec le paradoxe des anniversaires. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Alright, then wouldn't that be better if you put the issue link in the comment so that anyone can access to discussion and context about the issue? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Follow-up: #251 |
||
|
||
try: | ||
r = session.get(url, timeout=5, stream=True) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
from django.test import TestCase | ||
from mangaki.models import Work, Category, Editor, Studio | ||
from mangaki.utils.anidb import AniDB | ||
|
||
|
||
class AniDBTest(TestCase): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd really prefer a We should be able to provide a fixture of test AniDB responses and feed them to the test case. Mocking + fixturing this test case will be interesting and set an example for the codebase. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can't mock. I will remove it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tests-chan :'( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Follow-up: #250 |
||
|
||
def create_anime(self, **kwargs): | ||
anime = Category.objects.get(slug='anime') | ||
return Work.objects.create(category=anime, **kwargs) | ||
|
||
def setUp(self): | ||
# FIXME: The defaults for editor and studio in Work requires those to | ||
# exist, or else foreign key constraints fail. | ||
Editor.objects.create(pk=1) | ||
Studio.objects.create(pk=1) | ||
self.anidb = AniDB('mangakihttp', 1) | ||
|
||
def test_anidb_search(self): | ||
results = self.anidb.search(q='sangatsu no lion') | ||
self.assertNotEqual(len(results), 0) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Quite basic as a test, could we try to see if the result contain really something similar to |
||
|
||
def test_anidb_get(self): | ||
anime = self.create_anime(**self.anidb.get(11606)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It is composed of three distinct primes factors: 2, 7, 829. But, really, I am not convinced that we should rely on AniDB server-side IDs (e.g. though, we store them in our DB, agreed), I'd prefer if we could test random ID or rely on fixture responses rather than network responses. |
||
self.assertNotEqual(anime.title, '') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What if I just created the anime: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,12 @@ | ||
import requests | ||
from bs4 import BeautifulSoup | ||
|
||
from datetime import datetime | ||
|
||
|
||
BASE_URL = "http://api.anidb.net:9001/httpapi" | ||
SEARCH_URL = "http://anisearch.outrance.pl/" | ||
PROTOCOL_VERSION = 1 | ||
|
||
try: | ||
str = unicode | ||
except: | ||
pass | ||
# str = str | ||
|
||
class AniDB: | ||
def __init__(self, client_id, client_ver=0): | ||
|
@@ -45,12 +40,10 @@ def search(self, q): | |
results = [] | ||
animetitles = BeautifulSoup(r.text, 'xml').animetitles | ||
for anime in animetitles.find_all('anime'): | ||
results.append(Anime({ | ||
'id': int(anime['aid']), | ||
'title': str(anime.find('title', attrs={'type': "official"}).string), | ||
'picture': "http://img7.anidb.net/pics/anime/" + str(anime.find('picture').string), | ||
}, partial=True, updater=lambda: self.get(anime.id))) | ||
|
||
results.append( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It appears to me this is a good candidate to transform this function into a generator, if Otherwise, I'm wondering if it'd be more optimal to use a |
||
(int(anime['aid']), | ||
str(anime.find('title', attrs={'type': "official"}).string)) | ||
) | ||
return results | ||
|
||
def get(self, id): | ||
|
@@ -62,81 +55,30 @@ def get(self, id): | |
|
||
r = self._request("anime", {'aid': id}) | ||
soup = BeautifulSoup(r.text.encode('utf-8'), 'xml') # http://stackoverflow.com/questions/31126831/beautifulsoup-with-xml-fails-to-parse-full-unicode-strings#comment50430922_31146912 | ||
"""with open('backup.xml', 'w') as f: | ||
f.write(r.text)""" | ||
with open('backup.xml', 'w') as f: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you explain why do you create this file, which purpose does it serve? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It was for debug, bien vu! |
||
f.write(r.text) | ||
if soup.error is not None: | ||
raise Exception(soup.error.string) | ||
|
||
anime = soup.anime | ||
titles = anime.titles | ||
|
||
a = Anime({ | ||
'id': id, | ||
'type': str(anime.type.string), | ||
'episodecount': int(anime.episodecount.string), | ||
'startdate': datetime(*list(map(int, anime.startdate.string.split("-")))), | ||
'enddate': datetime(*list(map(int, anime.enddate.string.split("-")))), | ||
'titles': [( | ||
str(title.string), | ||
title['type'] if 'type' in title else "unknown" | ||
) for title in anime.find_all('title')], | ||
'title': str(titles.find('title', attrs={'type': "main"}).string), | ||
'relatedanime': [], | ||
'url': str(anime.url.string) if anime.url else None, | ||
'creators': anime.creators, | ||
'description': str(anime.description.string), | ||
'ratings': SmartDict({ | ||
'permanent': float(anime.ratings.permanent.string), | ||
'temporary': float(anime.ratings.temporary.string), | ||
'review': float(anime.ratings.review.string) if anime.ratings.review else '' | ||
}), | ||
'picture': "http://img7.anidb.net/pics/anime/" + str(anime.picture.string), | ||
'categories': [], | ||
'tags': [], | ||
'characters': [], | ||
'episodes': [], | ||
}) | ||
|
||
self._cache[id] = a | ||
|
||
all_titles = anime.titles | ||
# creators = anime.creators | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Add a |
||
# episodes = anime.episodes | ||
# tags = anime.tags | ||
# characters = anime.characters | ||
# ratings = anime.ratings.{permanent, temporary} | ||
|
||
a = { | ||
'title': str(all_titles.find('title', attrs={'type': "main"}).string), | ||
'source': 'AniDB: ' + str(anime.url.string) if anime.url else None, | ||
'ext_poster': 'http://img7.anidb.net/pics/anime/' + str(anime.picture.string), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hotlink is hot. Could we use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are talking in 2.7. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm getting old. :'( |
||
# 'nsfw': ? | ||
'date': datetime(*list(map(int, anime.startdate.string.split("-")))), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we extract this magic method into a utility function which would be put at the top of this file? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't remember where it comes from but yes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As I recall, it comes from the original library. |
||
# not yet in model: 'enddate': datetime(*list(map(int, anime.enddate.string.split("-")))), | ||
'synopsis': str(anime.description.string), | ||
# 'artists': ? from anime.creators | ||
'nb_episodes': int(anime.episodecount.string), | ||
'anime_type': str(anime.type.string), | ||
'anidb_aid': id | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is generally a bad idea to name a variable (answer: because it's supposed to be a built-in Python function.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. Now please merge! |
||
} | ||
return a | ||
|
||
class SmartDict(dict): | ||
def __init__(self, *a, **kw): | ||
super(SmartDict, self).__init__(**kw) | ||
for x in a: | ||
try: | ||
self.update(x) | ||
except TypeError: | ||
self.update(x.__dict__) | ||
self.__dict__ = self | ||
|
||
class Anime: | ||
def __init__(self, data={}, partial=False, **kw): | ||
self._data = data | ||
self._partial = partial | ||
if partial: | ||
self._updater = kw['updater'] | ||
|
||
def _update(self): | ||
if not self._partial: | ||
raise Exception("Attempted to update a ready object") | ||
else: | ||
self._data = self._updater()._data | ||
self._partial = False | ||
|
||
def __getattr__(self, name): | ||
if name in self._data: | ||
return self._data[name] | ||
else: | ||
if self._partial: | ||
self._update() | ||
if name in self._data: | ||
return self._data[name] | ||
else: | ||
raise AttributeError("no attribute called '%s'" % name) | ||
else: | ||
raise AttributeError("no attribute called '%s'" % name) | ||
|
||
def __repr__(self): | ||
return u'<Anime %i "%s">' % (self.id, self.title) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CommandError
is unused.