
Planning to submit a PR: add Mikan Project (蜜柑计划) as a data source #74

Closed

trim21 opened this issue Jul 6, 2017 · 14 comments

Comments

@trim21
Contributor

trim21 commented Jul 6, 2017

Yesterday, when I added 恋爱禁止的世界, what actually got fetched was 捏造陷阱NTR.
Worst of all, there is still no New Game. The accuracy of bangumi.moe's data seems rather low; tags appear to be added by automatic recognition.

Mikan Project (http://mikanani.me/) has noticeably more accurate data, and it already separates shows from subtitle groups.
I'm planning to submit a PR that adds it as another data source, so data can be fetched from there as well.

I'm dying without New Game to watch.

@trim21 trim21 changed the title from "Considering submitting a PR: add Mikan Project as a data source" to "Planning to submit a PR: add Mikan Project as a data source" Jul 6, 2017
@RicterZ
Owner

RicterZ commented Jul 6, 2017 via email

@trim21
Contributor Author

trim21 commented Jul 6, 2017

I originally meant to file an issue yesterday, but then realized it was actually a problem with the upstream data, so I decided to add a data source myself. This issue is mainly to ask whether you mind, and whether you'd be willing to merge it once it's done. And if you are, whether there's anything about the implementation you'd object to, such as adding new dependencies.

@RicterZ
Owner

RicterZ commented Jul 6, 2017 via email

@w3eee

w3eee commented Jul 8, 2017

Piggybacking with a question: how does a subscription match a show to its corresponding torrent files? Is it just a comparison of names?
And what about torrents whose names don't follow any convention?
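As a rough illustration of the kind of title matching involved, here is a minimal sketch; the patterns and the fallback are hypothetical and are not the project's actual parse_episode implementation:

import re

# Hypothetical sketch: pull an episode number out of a torrent title by trying
# a few common naming patterns. Irregular titles that match nothing fall through
# and are reported as episode 0 (unknown).
EPISODE_PATTERNS = [
    re.compile(r'\[(\d{1,3})\]'),           # e.g. ...[07][GB][720P]
    re.compile(r'第(\d{1,3})[话話集]'),      # e.g. 第12话
    re.compile(r'\bEP?(\d{1,3})\b', re.I),  # e.g. E12 / EP12
]

def parse_episode(title):
    for pattern in EPISODE_PATTERNS:
        match = pattern.search(title)
        if match:
            return int(match.group(1))
    return 0  # unknown episode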

@RicterZ
Owner

RicterZ commented Jul 8, 2017 via email

@trim21 trim21 closed this as completed Aug 19, 2017
@trim21
Contributor Author

trim21 commented Aug 23, 2017

I rewrote fetch.py and abstracted fetching from a data source into three methods:

from bgmi.config import MAX_PAGE  # assumed import; MAX_PAGE is referenced below
from bgmi.website.base import BaseWebsite


class BangumiMoe(BaseWebsite):
    cover_url = ''

    def search_by_keyword(self, keyword, count):
        return []

    def fetch_bangumi_calendar_and_subtitle_group(self):
        return [], []

    def fetch_episode_of_bangumi(self, bangumi_id, subtitle_list=None, max_page=MAX_PAGE):
        return []

To swap in a different data source, you only need to override these three methods.. (the shared half of the contract is sketched below)
The data source cannot be switched while the program is in use.

The change is fairly large, and it feels like it partly overlaps with what script.py does....
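A minimal sketch of that shared half, assuming hypothetical helper names (update_calendar and save_data are illustrative, not the actual BaseWebsite API):

class BaseWebsite(object):
    # Hypothetical template-method sketch: shared logic lives in the base class
    # and calls the three methods that every concrete data source overrides.
    def update_calendar(self):
        bangumi_list, subtitle_groups = self.fetch_bangumi_calendar_and_subtitle_group()
        self.save_data(bangumi_list, subtitle_groups)

    def save_data(self, bangumi_list, subtitle_groups):
        # persistence is omitted in this sketch
        pass

    def fetch_bangumi_calendar_and_subtitle_group(self):
        raise NotImplementedError  # each data source implements this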

@trim21 trim21 reopened this Aug 23, 2017
@RicterZ
Owner

RicterZ commented Aug 25, 2017

Emm, for bgmi script I'm planning to add a custom model feature; still thinking it through.
The current idea is that your Mikan code could act as an API that a script can call with parameters to get results back; that would be very convenient..

from xx import get_bangumi

class Script(xx):
    ...
    def get_bangumi_data(self, x):
        return get_bangumi(x)

Something along those lines..

@trim21
Contributor Author

trim21 commented Aug 25, 2017

After round after round of changes, fetch.py ended up like this.. which seems pretty close to your idea?
I added a WEBSITE_NAME config option, defaulting to bangumi_moe:

# coding=utf-8
from __future__ import print_function, unicode_literals

from bgmi.config import WEBSITE_NAME

from bgmi.website.bangumimoe import BangumiMoe
from bgmi.website.mikan import Mikanani

if WEBSITE_NAME == 'mikan_project':
    website = Mikanani()
else:
    website = BangumiMoe()
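With this, the rest of the code can stay source-agnostic. A sketch of the intended usage, assuming the rewritten module is importable as bgmi.fetch:

# Hypothetical caller: everything goes through the shared `website` object,
# so the rest of BGMI never needs to know which site is configured.
from bgmi.fetch import website

bangumi_list, subtitle_groups = website.fetch_bangumi_calendar_and_subtitle_group()
results = website.search_by_keyword('来自深渊', count=1)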

@trim21
Contributor Author

trim21 commented Aug 25, 2017

bangumimoe.py
To add a data source now, you only need to import BaseWebsite from bgmi.website.base and then implement three methods.
Filtering, data storage and the like all live in BaseWebsite.
I also added a few lines to main.py to pick a data source on first launch (sketched after the skeleton below)...

from bgmi.config import MAX_PAGE  # assumed import; MAX_PAGE is referenced below
from bgmi.website.base import BaseWebsite

COVER_URL = ''  # placeholder; the real constant is defined in the actual module


class BangumiMoe(BaseWebsite):
    cover_url = COVER_URL

    def search_by_keyword(self, keyword, count):
        """
        return a list of dicts, each with at least 4 keys: download, name, title, episode
        example:
        ```
            [
                {
                    'name':"路人女主的养成方法",
                    'download': 'magnet:?xt=urn:btih:what ever',
                    'title': "[澄空学园] 路人女主的养成方法 第12话 MP4 720p  完",
                    'episode': 12
                },
            ]
        ```
        :param keyword: search keyword
        :type keyword: str
        :param count: how many pages to fetch from the website
        :type count: int
        :return: list of episode search results
        :rtype: list[dict]
        """
        return []

    def fetch_episode_of_bangumi(self, bangumi_id, subtitle_list=None, max_page=MAX_PAGE):
        """
        get all episodes of a bangumi by its id
        example:
        ```
            [
                {
                    "download": "magnet:?xt=urn:btih:e43b3b6b53dd9fd6af1199e112d3c7ff15cab82c",
                    "name": "来自深渊",
                    "subtitle_group": "58a9c1c9f5dc363606ab42ec",
                    "title": "【喵萌奶茶屋】★七月新番★[来自深渊/Made in Abyss][07][GB][720P]",
                    "episode": 0,
                    "time": 1503301292
                },
            ]
        ```
        :param bangumi_id: id of the bangumi
        :param subtitle_list: list of subtitle group ids
        :type subtitle_list: list
        :param max_page: how many pages to crawl if there is no subtitle list
        :type max_page: int
        :return: list of episodes
        :rtype: list[dict]
        """
        return []

    def fetch_bangumi_calendar_and_subtitle_group(self):
        """
        return a list of all bangumi and a list of all subtitle groups

        bangumi dict:
        update time should be one of ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']
        example:
        ```
            [
                {
                    "status": 0,
                    "subtitle_group": [
                        "123",
                        "456"
                    ],
                    "name": "名侦探柯南",
                    "keyword": "1234", #bangumi id
                    "update_time": "Sat",
                    "cover": "data/images/cover1.jpg"
                },
            ]
        ```

        subtitle group dict:
        example:
        ```
            [
                {
                    'id': '233',
                    'name': 'bgmi字幕组'
                }
            ]
        ```


        :return: list of bangumi, list of subtitle groups
        :rtype: (list[dict], list[dict])
        """

        return [], []
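A minimal sketch of the first-launch selection mentioned above; the prompt and function name are illustrative, and write_config is assumed to be the project's config-persisting helper:

from bgmi.config import write_config  # assumed location of a config-persisting helper

SUPPORTED_WEBSITES = ['bangumi_moe', 'mikan_project']

def choose_data_source():
    # Hypothetical first-launch prompt added to main.py.
    print('Choose a data source:')
    for index, name in enumerate(SUPPORTED_WEBSITES):
        print('{}) {}'.format(index + 1, name))
    choice = int(input('> ')) - 1
    write_config('DATA_SOURCE', SUPPORTED_WEBSITES[choice])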

@RicterZ
Owner

RicterZ commented Aug 26, 2017 via email

@RicterZ
Owner

RicterZ commented Aug 28, 2017

Could you add the datasource configuration to the README?

@trim21
Contributor Author

trim21 commented Aug 28, 2017

I've already added it to the README...

Additional config

DATA_SOURCE: data source, currently supports bangumi_moe (default) and mikan_project
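Assuming the config field was renamed from WEBSITE_NAME to DATA_SOURCE, the dispatch in fetch.py from the earlier comment would then read:

# coding=utf-8
from __future__ import print_function, unicode_literals

from bgmi.config import DATA_SOURCE

from bgmi.website.bangumimoe import BangumiMoe
from bgmi.website.mikan import Mikanani

if DATA_SOURCE == 'mikan_project':
    website = Mikanani()
else:
    website = BangumiMoe()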

@trim21
Contributor Author

trim21 commented Aug 28, 2017

Just found a bug in parse_episode.. fixing it now..

@RicterZ
Owner

RicterZ commented Aug 28, 2017 via email

@trim21 trim21 closed this as completed Aug 28, 2017
RicterZ pushed a commit that referenced this issue Jan 29, 2018