Rewrite the AniDB utility (together with a test!) #248
    parser.add_argument('id', type=int)

def handle(self, *args, **options):
    if options.get('id'):
Is it possible for this to be false?
Not really. It's an unnecessary safeguard.
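A small sketch of why the guard looks redundant, using plain `argparse` (Django's command parser builds on it). The parser below is a stand-in, not the project's command. A positional argument declared with `type=int` is required, so a missing value is rejected before `handle` would run. The one value that makes `options.get('id')` falsy is an explicit `0`.

```python
import argparse

# Stand-in for the parser a management command receives in add_arguments().
parser = argparse.ArgumentParser(prog="anidb")
parser.add_argument("id", type=int)

# A valid ID is parsed into an int and is truthy.
options = vars(parser.parse_args(["42"]))
print(options["id"], bool(options.get("id")))

# Omitting the positional argument makes argparse exit with an error,
# so the command body never runs with a missing ID.
try:
    parser.parse_args([])
    missing_rejected = False
except SystemExit:
    missing_rejected = True
print(missing_rejected)

# The only way to obtain a falsy value is an explicit 0.
zero_options = vars(parser.parse_args(["0"]))
print(bool(zero_options.get("id")))
```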
mangaki/mangaki/utils/anidb.py
Outdated
@@ -62,81 +55,30 @@ def get(self, id):

    r = self._request("anime", {'aid': id})
    soup = BeautifulSoup(r.text.encode('utf-8'), 'xml')  # http://stackoverflow.com/questions/31126831/beautifulsoup-with-xml-fails-to-parse-full-unicode-strings#comment50430922_31146912
    """with open('backup.xml', 'w') as f:
        f.write(r.text)"""
    with open('backup.xml', 'w') as f:
Could you explain why you create this file? What purpose does it serve?
It was for debugging, good catch!
mangaki/mangaki/utils/anidb.py
Outdated
    # 'artists': ? from anime.creators
    'nb_episodes': int(anime.episodecount.string),
    'anime_type': str(anime.type.string),
    'anidb_aid': id
It is generally a bad idea to name a variable `id`. Time for a riddle: why?
(Answer: because `id` is already a built-in Python function, and the variable shadows it.)
Fixed. Now please merge!
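To illustrate the riddle, here is a minimal sketch of what shadowing the built-in `id` does inside a function. The function names are made up for the example and do not come from the project.

```python
def describe(id):
    # Inside this function, the parameter hides the built-in id().
    try:
        return id(object())
    except TypeError as exc:
        return "TypeError: " + str(exc)


def describe_renamed(anidb_aid):
    # With a distinct parameter name, the built-in id() stays usable.
    return isinstance(id(object()), int)


shadowed_result = describe(42)
renamed_result = describe_renamed(42)
print(shadowed_result)
print(renamed_result)
```

The first call fails because `42` is not callable. The second works because nothing hides the built-in.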
Codecov Report
@@           Coverage Diff           @@
##           master     #248   +/-   ##
=======================================
  Coverage   56.51%   56.51%
=======================================
  Files          13       13
  Lines         706      706
=======================================
  Hits          399      399
  Misses        307      307
Continue to review full report at Codecov.
def handle(self, *args, **options):
    anidb = AniDB('mangakihttp', 1)
    anime = create_anime(**anidb.get(options.get('id')))
    anime.retrieve_poster()  # Save for future use
I wasn't sure earlier about the print statements - but maybe printing the anime would make sense here?
Agreed, but it should use Django's proper output mechanism for management commands, I think.
Yeah, I even added syntax coloring! Thanks for the tip. https://docs.djangoproject.com/en/1.10/howto/custom-management-commands/#module-django.core.management
Stop making me speak English. It's pedantic. But fine, it opens the project up to more people, so OK.
mangaki/mangaki/utils/anidb.py
Outdated
    """
    Allows retrieval of non-file or episode related information for a specific anime by AID (AniDB anime id).
    http://wiki.anidb.net/w/HTTP_API_Definition#Anime
    """
    id = int(id)  # why?
    anidb_aid = int(anidb_aid)  # why?
why => probably used with a query param, which would be a string?
Regarding the failing build, here is where it comes from: when BeautifulSoup is used in XML decoding mode, it needs a parser such as
@jilljenn
Summary:
- I would really prefer that the test case not interact with the network, e.g. AniDB (not spamming them each time our CI runs the tests would be nice).
- Minor cleanup, e.g. unused modules, nitpicking on URL joining, test behavior, reusability.
After this, 🚀 !
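One way to keep such a test off the network is to patch the method that performs the HTTP call and feed it a canned response. The sketch below uses only the standard library. `SketchAniDBClient`, its fields, and the XML sample are hypothetical stand-ins modeled on the snippets in this review, not the project's actual class or AniDB's full response format.

```python
import unittest
from unittest import mock
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed response used instead of a real HTTP call.
CANNED_XML = "<anime id='1'><episodecount>26</episodecount><type>TV Series</type></anime>"


class SketchAniDBClient:
    """Stand-in for the AniDB client; not the project's class."""

    def _request(self, datapage, params):
        # The real method would perform an HTTP request here.
        raise RuntimeError("network access is not allowed in tests")

    def get(self, anidb_aid):
        root = ET.fromstring(self._request("anime", {"aid": anidb_aid}))
        return {
            "nb_episodes": int(root.findtext("episodecount")),
            "anime_type": root.findtext("type"),
            "anidb_aid": anidb_aid,
        }


class AniDBOfflineTest(unittest.TestCase):
    def test_get_parses_canned_response(self):
        client = SketchAniDBClient()
        # Replace the network-facing method with a mock returning canned XML.
        with mock.patch.object(client, "_request", return_value=CANNED_XML) as fake:
            result = client.get(1)
        fake.assert_called_once_with("anime", {"aid": 1})
        self.assertEqual(result["nb_episodes"], 26)
        self.assertEqual(result["anime_type"], "TV Series")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AniDBOfflineTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())
```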
from mangaki.utils.anidb import AniDB
from mangaki.models import Work, Category
from django.db.models import Count
from urllib.parse import urlparse, parse_qs
`parse_qs` is unused.
Can't we find an automated thing that does this job?
Sure, theoretically, autoflake should do the job.
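For intuition about what such a tool has to detect, here is a deliberately naive sketch based on the standard `ast` module. It is an illustration only, not autoflake's actual implementation, and it ignores cases a real tool must handle (`__all__`, re-exports, names used only in strings).

```python
import ast


def find_unused_imports(source):
    """Return the sorted names that are imported but never referenced in `source`."""
    tree = ast.parse(source)
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds the top-level name `a`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


SAMPLE = '''
from urllib.parse import urlparse, parse_qs
import sys

print(urlparse("http://example.com").netloc)
'''

unused = find_unused_imports(SAMPLE)
print(unused)
```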
from django.core.management.base import BaseCommand, CommandError
from mangaki.utils.anidb import AniDB
from mangaki.models import Work, Category
from django.db.models import Count
Unused.
from mangaki.models import Work, Category
from django.db.models import Count
from urllib.parse import urlparse, parse_qs
import sys
Unused.
@@ -0,0 +1,25 @@
from django.core.management.base import BaseCommand, CommandError
`CommandError` is unused.
if 'anidb_aid' in kwargs:
    return Work.objects.update_or_create(category=anime, anidb_aid=kwargs['anidb_aid'], defaults=kwargs)[0]
else:
    return Work.objects.create(category=anime, **kwargs)
What if we don't specify an `anidb_aid` and we create a duplicate by mistake?
Is there any escape hatch to prevent this behavior, which is destroying our DB?
Destroying is not the correct word.
Actually, for this function only, the if is unnecessary.
Right. Still, that does not answer the question: if `anidb_aid` is not specified, can we create a duplicate (due to no constraints on the DB side or in the model manager)?
mangaki/mangaki/utils/anidb.py
Outdated
    """
    Allows retrieval of non-file or episode related information for a specific anime by AID (AniDB anime id).
    http://wiki.anidb.net/w/HTTP_API_Definition#Anime
    """
    id = int(id)  # why?
    anidb_aid = int(anidb_aid)  # why?
It ensures that the `anidb_aid` given is a number / integer. It must have been added to interface with odd code that calls this function with a string. It's not necessary; we could just assume that `anidb_aid` is of type int using type hinting.
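A short sketch of the trade-off mentioned here, with hypothetical function names: a type hint documents the expected type but Python does not enforce it at runtime, so a string still gets through unless there is an explicit check (or a static checker is run on the callers).

```python
def get(anidb_aid: int) -> dict:
    # The annotation documents the expected type; Python does not enforce it.
    return {"anidb_aid": anidb_aid}


def get_checked(anidb_aid: int) -> dict:
    # An explicit check is needed to reject strings at runtime.
    if not isinstance(anidb_aid, int):
        raise TypeError("anidb_aid must be an int, got %s" % type(anidb_aid).__name__)
    return {"anidb_aid": anidb_aid}


unchecked = get("42")  # accepted despite the annotation
try:
    get_checked("42")
    rejected = False
except TypeError:
    rejected = True
print(unchecked, rejected)
```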
        'picture': "http://img7.anidb.net/pics/anime/" + str(anime.find('picture').string),
    }, partial=True, updater=lambda: self.get(anime.id)))

results.append(
It appears to me this function is a good candidate for becoming a generator, if `animetitles.find_all` is also a generator, of course.
Otherwise, I'm wondering whether it would be more efficient to use a `dict` to store anime by ID. Anyway, this is okay.
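A minimal sketch of both ideas, using the standard `xml.etree.ElementTree` instead of BeautifulSoup so that it runs without extra dependencies. The XML layout and the function name are assumptions for illustration. Note that in BeautifulSoup 4, `find_all` returns a list-like result rather than a generator, so the laziness would only apply to the function's own output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified title dump.
TITLES_XML = """
<animetitles>
  <anime aid="1"><title type="main">First Title</title></anime>
  <anime aid="2"><title type="main">Second Title</title></anime>
</animetitles>
"""


def iter_main_titles(xml_text):
    """Yield (aid, title) pairs lazily instead of appending to a results list."""
    root = ET.fromstring(xml_text)
    for anime in root.iter("anime"):
        title = anime.find("title[@type='main']")
        if title is not None:
            yield int(anime.get("aid")), title.text


titles = iter_main_titles(TITLES_XML)
first = next(titles)  # only the first pair has been produced so far

# The same generator can feed a dict keyed by ID for direct lookups.
by_id = dict(iter_main_titles(TITLES_XML))
print(first, by_id)
```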
mangaki/mangaki/utils/anidb.py
Outdated
def __repr__(self):
    return u'<Anime %i "%s">' % (self.id, self.title)
all_titles = anime.titles
# creators = anime.creators
Add a `TODO` or `FIXME` marker so that someone can catch it while running `ag 'TODO' mangaki/**` when bored.
mangaki/mangaki/utils/anidb.py
Outdated
anime_dict = {
    'title': str(all_titles.find('title', attrs={'type': "main"}).string),
    'source': 'AniDB: ' + str(anime.url.string) if anime.url else None,
    'ext_poster': 'http://img7.anidb.net/pics/anime/' + str(anime.picture.string),
Hotlink is hot. Could we use `urlparse.urljoin` rather than concatenating manually? (just for extra safety!)
That is the Python 2.7 spelling. In Python 3: `from urllib.parse import urljoin`.
I'm getting old. :'(
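For reference, a quick sketch of how `urljoin` behaves with the base URL from the diff. The file name `12345.jpg` is a made-up example. Two details are worth knowing: the trailing slash on the base matters, and an absolute second argument replaces the base entirely.

```python
from urllib.parse import urljoin

BASE = "http://img7.anidb.net/pics/anime/"

# With a trailing slash, the file name is appended to the directory.
with_slash = urljoin(BASE, "12345.jpg")

# Without it, the last path segment of the base is replaced.
without_slash = urljoin(BASE.rstrip("/"), "12345.jpg")

# An absolute URL as second argument overrides the base.
absolute = urljoin(BASE, "http://example.com/other.jpg")

print(with_slash)
print(without_slash)
print(absolute)
```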
mangaki/mangaki/utils/anidb.py
Outdated
    'source': 'AniDB: ' + str(anime.url.string) if anime.url else None,
    'ext_poster': 'http://img7.anidb.net/pics/anime/' + str(anime.picture.string),
    # 'nsfw': ?
    'date': datetime(*list(map(int, anime.startdate.string.split("-")))),
Can we extract this magic one-liner into a utility function placed at the top of this file,
so that we can easily reuse this datetime parsing logic for every field that may require it?
I don't remember where it comes from but yes.
As I recall, it comes from the original library.
mangaki/mangaki/utils/anidb.py
Outdated
# characters = anime.characters
# ratings = anime.ratings.{permanent, temporary}

print(urljoin('http://img7.anidb.net/pics/anime/', str(anime.picture.string)))
Debugging prints.
Done.
except:
    pass
# str = str
def to_python_datetime(mal_date):
Good candidate for doctesting.
Done.
Doctesting, not only docs 😄 ! (except if GitHub didn't reload the code.)
Does it count for codecov? I think it's not that useful, but seems like a good thing to learn.
Done.
It counts for codecov (theoretically)!
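To make the difference between documentation and doctesting concrete, here is a small sketch. `parse_ymd` is a hypothetical helper that reuses the `split("-")` logic from the diff and assumes a full 'YYYY-MM-DD' string; it is not the project's `to_python_datetime`. The examples in the docstring are executed and compared with the expected output. Whether they count toward coverage depends on the test runner executing them under the coverage tool, which this thread does not show.

```python
import doctest
from datetime import datetime


def parse_ymd(date_string):
    """Convert a 'YYYY-MM-DD' string into a datetime.

    >>> parse_ymd("2016-11-05")
    datetime.datetime(2016, 11, 5, 0, 0)
    >>> parse_ymd("1998-04-03").year
    1998
    """
    return datetime(*map(int, date_string.split("-")))


# Collect the examples embedded in the docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(parse_ymd, "parse_ymd", globs={"parse_ymd": parse_ymd}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

In a module, the usual shortcut is `python -m doctest path/to/module.py`, which runs every docstring example in the file.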
* Rewrite the AniDB utility (together with a test!)
* Remove useless lines
* Rename variables
* Add style
* Remove useless code
* Remove unit tests for AniDB
* Improve style
* Add doctesting

* vagrant: update and repair the Vagrantfile
* vagrant: transfer the current user's keys into the machine
* readme: update instructions
* vagrant(bootstrap): install virtualenv as a user rather than globally
* vagrant(size): warning about the size taken by the installation
* Store posters in a FileField (#235)
* Rename Work.poster to Work.ext_poster
* Be less aggressive with local poster handling
  - Do not remove the external poster URL when downloading it locally; it is still a piece of possibly valuable information (especially since it is the only way we currently have to link an anime to MAL!)
  - If there is no locally available poster, fall back to the external URL.
* Make admin action for refreshing posters actually work
* Ignore media/ directory for development environments
* Add FileField for posters
* Clean up poster retrieval
  Various fixes for poster retrieval:
  - `retrieve_poster` now uses the requests library and is moved to a method on the Work class. It downloads the known external poster by default.
  - The admin action for updating posters no longer tries to re-download an existing poster onto itself.
* Add management command to bulk download posters
* Make retrieve_poster update the external poster URL
* Update seed data to replace Work.poster with Work.ext_poster
* Merge migrations
* Address PR comments
* Fix path and README (#244)
* Add line to admin and change news (#245)
  Update news
* script(bootstrap): remove the comment which got in the ALLOWED_HOSTS array…
* Code coverage is now browsable on the codecov website (#252)
* coverage has to run in the proper folder, and nosetests has to know where the tests are
* circle: fix path to manage.py for the coverage run
* Rewrite the AniDB utility (together with a test!) (#248)
* Rewrite the AniDB utility (together with a test!)
* Remove useless lines
* Rename variables
* Add style
* Remove useless code
* Remove unit tests for AniDB
* Improve style
* Add doctesting
* This is a commit. * Upper and latter corrected * Written tests for mangaki * Should be okay now. * Fixed stuff: DRY principle applied. URL to work applied. All correct handlers created. * Added 500 base error view in case of DatabaseError. * Looks better that way. * management_commands: add a sketch of generate seed data command for every purpose * seed data generation: use the temp database More verbose output to understand how the process is moving Fix the argument parsing * experiment: generate seed data w/o database cloning * Make tests pass, and actually run them on CircleCI. (#223) Merging with approval from @RaitoBezarius. * Do not require discourse settings for avatar initialization (#229) * Clean up signal handling (#231) * Clean up signal handling - We no longer use a post_save signal for updating scores when a Suggestion is changed; rather this has its place directly in the Suggestion.save method. - Signals are connected in the ready() method of Mangaki's new application configuration class, as recommended by Django's documentation: https://docs.djangoproject.com/en/dev/topics/signals/#connecting-receiver-functions Note that we use a receivers module instead of a signals module to allow for the creation and import of custom signals without registering handlers if the need ever arises. - Profile creation is no longer tied to login with django-allauth but rather to the actual User model creation. This helps handling corner cases such as accounts created through the `manage.py createsuperuser` management command actually having a profile. 
* Update tests for automatic profile creation * Address PR comments * circle: get back to the project root folder after tests (#233) * Add tests for searching works (#232) * Upgrade Mangaki to Django 1.10 (#234) * Lint the last commit in the CI (#226) * Disable git lint until we can configure it properly * Add some tests ensuring views are not crashing (#237) * Fix typo (unreviewed) * train_test_split moved in sklearn 0.18 (#240) Fixes #239 * Move Mangaki into the root folder (#227) * The `mangaki` folder content has been moved to the root of the repository. * After pulling this commit, many files will be reorganized, backup your work tree before pulling. * Pay attention to your `settings.ini` and path-dependant code, though, they should not be affected by this change. * unreviewed: fix spacing in README * Revert "unreviewed: fix spacing in README" This reverts commit f9848a9. * Revert "Move Mangaki into the root folder (#227)" This reverts commit 74d8749. This breaks mangaki. * Store posters in a FileField (#235) * Rename Work.poster to Work.ext_poster * Be less aggressive with local poster handling - Do not remove the external poster URL when downloading it locally; it is still a piece of possibly valuable information (especially since it is the only way we currently have to link an anime do MAL!) - If there is no locally available poster, fall back to the external URL. * Make admin action for refreshing posters actually work * Ignore media/ directory for development environments * Add FileField for posters * Clean up poster retrieval Various fixes for poster retrieval: - `retrieve_poster` now uses the requests library and is moved to a method on the Work class. It downloads the known external poster by default. - The admin action for updating posters no longer tries to re-download an existing poster onto itself. 
* Add management command to bulk download posters * Make retrieve_poster update the external poster URL * Update seed data to replace Work.poster with Work.ext_poster * Merge migrations * Address PR comments * Fix path and README (#244) * Add line to admin and change news (#245) Update news * Code coverage is now browsable on the codecov website (#252) * coverage has to run into the proper folder and nosetests to know about where are the tests * circle: fix path to manage.py for the coverage run * Rewrite the AniDB utility (together with a test!) (#248) * Rewrite the AniDB utility (together with a test!) * Remove useless lines * Rename variables * Add style * Remove useless code * Remove unit tests for AniDB * Improve style * Add doctesting * Update and repair the Vagrantfile (#243) * vagrant: update and repair the Vagrantfile * vagrant: transfer the current user's keys into the machine * readme: update instructions * vagrant(bootstrap): install virtualenv as a user rather than globally * vagrant(size): warning about the size taken by the installation * Store posters in a FileField (#235) * Rename Work.poster to Work.ext_poster * Be less aggressive with local poster handling - Do not remove the external poster URL when downloading it locally; it is still a piece of possibly valuable information (especially since it is the only way we currently have to link an anime do MAL!) - If there is no locally available poster, fall back to the external URL. * Make admin action for refreshing posters actually work * Ignore media/ directory for development environments * Add FileField for posters * Clean up poster retrieval Various fixes for poster retrieval: - `retrieve_poster` now uses the requests library and is moved to a method on the Work class. It downloads the known external poster by default. - The admin action for updating posters no longer tries to re-download an existing poster onto itself. 
* Add management command to bulk download posters * Make retrieve_poster update the external poster URL * Update seed data to replace Work.poster with Work.ext_poster * Merge migrations * Address PR comments * Fix path and README (#244) * Add line to admin and change news (#245) Update news * script(bootstrap): remove the comment which got in the ALLOWED_HOSTS array… * Code coverage is now browsable on the codecov website (#252) * coverage has to run into the proper folder and nosetests to know about where are the tests * circle: fix path to manage.py for the coverage run * Rewrite the AniDB utility (together with a test!) (#248) * Rewrite the AniDB utility (together with a test!) * Remove useless lines * Rename variables * Add style * Remove useless code * Remove unit tests for AniDB * Improve style * Add doctesting * Add new WALS algorithm from TensorFlow (#246) * Add new WALS algorithm from TensorFlow * Add WALS file * Improve style * Minor cleanup around the codebase (#253) * Fix syntax error in `reco_list.html` Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Mutable arguments are dangerous Default to None, if it's none, replace them by empty arrays. Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * {decode,encode}string are deprecated It's {decode,encode}bytes now. Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Reference local variable `now` properly before the loop Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Time to import time for the `retrieveposters` mgt command Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Reference the `nb_ratings` variable in the good scope. 
Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Remove unused imports from `zero.py` Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Remove unused imports Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Remove unused imports (import missing models for knn.py also) Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Import missing modules for NMF (otherwise, I don't see how it was working…) Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Requirements refactoring into folders Add matplotlib as requirement Signed-off-by: Raito Bezarius <masterancpp@gmail.com> * Remove more unused imports and add used imports * requirements: add production * cleanup: edit the README about requirements, remove old README in mangaki/ * readme: typo * hotfix: s/requirements-dev.txt/requirements/dev.txt (#258) * Ansible deployment for production (#171) * Rough ansible provisioning * ansible(roles): add initial roles * ansible(templates): add settings * ansible(vars): add mangaki * ansible(playbook): add the playbook * ansible(gitignore): ignore the hosts inventory and secrets vars * ansible(readme): explain about secrets * WIP 2 * Ansible: WIPv3 All remaining is unattended upgrades and email backend configuration. — Make Let's Encrypt renewal and setup works. — Add timers for ranking / top director. — Run Mangaki app server with supervisord. — Install lxml / numpy package to speed the pip install. 
* ansible: remove old mangaki.yml vars * ansible(email): add external smtp server support * Address PR comments * ansible: Refactor LE and NGINX into a same role * ansible: improve apt cache management and remove useless steps * ansible(letsencrypt): Force-restart NGINX after its installation, pin dehydrated's version * ansible(readme): make it more useful * ansible(nginx): remove escape character ' from main.yml * Remove useless notebooks (still ongoing) (#257) * Remove useless notebooks (still ongoing) * Cleanup notebooks * Notebook with examples and tests of SVD and DPP (#195) * create the first notebook * Start of the notebook on the plot * minor changes to the notebook, personal tests * minor changes: formatting * continue the notebook, copy over the start of a compute_similarity_cosine function * remove old files * continue the notebook, first function to get a similarity matrix * add a second, less computationally heavy cosine function, start of the DPP tests * progress on the notebook (dpp, distance, comparison), a bit (very) messy, do not read * notebook _notebook stage_: better implementation of cosine, start of jaccard; notebook _essai_: experiments, start of the order-r diameter * a few more tests, the one proposed by jj in notebooks _notebook stage_ and _essai, test_, start of the class (just a draft), creation of a sparse matrix to use ratings.csv directly and avoid having to run svd all the time * class for the dpp in dpp.py and new notebook _DPP_ to test it * Add file for requirements for algorithms * Update notebooks with annotations * Change to dpp.py: add and modify the classes and functions needed for the verification, testing (comparison), and implementation of the dpp (determinantal point process) * Changes to dpp.py: minor changes, add jaccard (ready-made function), more readable code with fewer errors. Direct use of the database not done yet but coming soon
* Changes to dpp.py and addition of draft notebooks: items are now fetched from the database rather than from ratings.csv. Draft notebooks "Test classe" and "Test classe-Copy1" where tests were run. They show that a big problem remains: some elements are or become "nan" during certain operations * Change to the future SimilarityMatrix class of dpp.py: no notable change in dpp.py. Creation of "test de Similarity[...]" to create, test, and modify the Similarity class. In progress: building the matrix from the raw data, limiting calls and the users and/or works that have no rating * new * Changes to dpp.py: dpp.py continued and nearly finished; the SimilarityMatrix class has been largely rewritten. The following classes, namely MangakiUniform and MangakiDPP, and the compare function were modified accordingly * Changes to dpp.py and to the test notebook "test de similarity_matrix qui sera ds dpp.py.ipynb": removed a useless function in dpp.py. Notebook change: a test of dpp.py is run near the end (see the heading "Test "final" de dpp.py") * Changes to dpp.py following elarnon's remarks, mainly: - new SimilarityMatrix constructor - code closer to the PEP8 recommendations (checked with flake8) - use of a sparse matrix instead of a matrix filled with zeros - use of np.random.choice instead of random.shuffle. Still to do: create the compare function elarnon asked for, called compare2 for now * Changes to dpp.py, mostly to compare2 (which will replace compare): - switched from the order-1 diameter to the order-0 diameter - changed the function's synopsis and arguments. To do: test it, in any case * Save * Checkpoint
* Latest changes to the files for the dpp-related algorithms: - dpp.py - buildmatrix.py, which contains a class that mainly builds a ratings matrix (users as rows, works/items as columns, ratings in the cells) from the database or a csv file. Change to make if buildmatrix.py is kept: remove the BuildMatrix class from dpp.py * Add pandas to requirements * Remove useless file * Determinantal Point Processes (#201) * implement dpp in mangaki * Latest version of dpp.py and buildmatrix.py: seems to work well and complies with PEP8 * Change to dpp.py: renamed variables in the diameter_0 function (variables without accents and in English) * Change to buildmatrix.py: fixed a careless mistake, a "rating" had slipped in where a "choice" should be ... * Integrate dpp into the site: beginning * progress * Not much new: we should take into account that one can choose to get only manga or only anime with the dpp * Progress on the dpp integration * url problem and view to determine whether dpp mode should be on or not * a few changes * a few changes * old files * trial * Integrate dpp into the site, with recommendations this time. Could be improved (nearly duplicate functions for the versions with and without dpp) * small changes but not finished yet * master * fixes in progress * changes * more fixes (still not finished) * problem: displaying anime/manga only does not work * fixes * forgotten bits * remove a useless migration (actually already present in master) * remove dpp recommendations for ordinary users * start of corrections * Fixes: the popularity problem should be solved (but not really tested since I only have 20 works in total ^^).
If testing from a large seed: increase the number of works taken into account in popular (dpp in models.py) and the number of points in the dpp sample (in views.py). The url problem with sort and the dpp keywords is still to do * improvements * trials * changes, still things to do (the urls, and see elarnon's latest remarks) * improve the urls * The url issue for dpp is solved. Renamed the "DPP" titles to "Découvrir" ("Discover") * last (?) fixes following elarnon's messages, unless I'm mistaken. The code still needs cleaning and a check that PEP8 is respected * better PEP8 compliance * fixes * latest master version * Changed the migration dependencies so that it works (the management command works, and so does loaddata of the seed (but not the big seed :/)) * Removed a useless notebook that had no business being there * fixes, still to be verified by testing on the dev version of mangaki * Fixes following raito's remarks * Removed temporary variables in ratingsmatrix.py as they were useless * typography * new message error more accurate * Remove heavy files, add migrations * Fix tests * Fix test * Clean code * Add migration for tropes * Add request to server_error, remove handler500 from urls
Fixes #220.
The `random_ip` method does not work anymore on the MyAnimeList API :D