Add URL matcher for the new Bilibili `BVID` #571

Open
AgFlore opened this issue Mar 25, 2020 · 9 comments

@AgFlore AgFlore commented Mar 25, 2020

Bilibili has introduced a new video-identifier system called bvid, which uses a string starting with "BV". The old "av1234567" aid identifier is still accessible, but it is no longer the default video URL pattern.
For example, https://www.bilibili.com/video/av6009789 is now equivalent to (and usually redirects to) https://www.bilibili.com/video/BV1Qs411k7Qv

Both aid and bvid identifiers are accepted by the API we currently use. For example, https://api.bilibili.com/x/web-interface/archive/stat?bvid=1Qs411k7Qv and https://api.bilibili.com/x/web-interface/archive/stat?aid=6009789 return the same result.

Generally, the bvid can be generated from the aid; roughly, this involves a base58 encoding of the aid combined with an XOR against a magic number.

Since there's no sign that the aid identifier will be discarded, I currently see no need to migrate our database from aid to bvid; adding a URL matcher should be sufficient.
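A minimal sketch of what such a URL matcher could look like, in Python for illustration only. The pattern and the function name `match_bilibili_url` are hypothetical, and the assumption that a bvid is "BV" followed by exactly 10 alphanumeric characters is inferred from the single example above rather than from any official specification.

```python
import re

# Hypothetical pattern: accepts the legacy "av<digits>" form and the new "BV..." form.
# Assumption: a bvid is "BV" plus 10 alphanumeric characters, as in BV1Qs411k7Qv.
_BILIBILI_VIDEO = re.compile(
    r"^(?:https?://)?(?:www\.)?bilibili\.com/video/"
    r"(?:av(?P<aid>\d+)|(?P<bvid>BV[0-9A-Za-z]{10}))"
    r"(?:[/?#].*)?$"
)


def match_bilibili_url(url: str):
    """Return ("aid", digits) or ("bvid", id), or None if the URL does not match."""
    m = _BILIBILI_VIDEO.match(url.strip())
    if m is None:
        return None
    if m.group("aid") is not None:
        return ("aid", m.group("aid"))
    return ("bvid", m.group("bvid"))
```

Returning the kind of identifier alongside its value lets the caller decide which API parameter to send without re-inspecting the string.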

(As for the video-thumb issue: I think adding referrerpolicy="no-referrer" to our img element will fix it.)


@riipah riipah commented Mar 25, 2020

To save my time, could you check the Bilibili API documentation and see how it has changed? Does the "Video info request" now return bvid?

Note that the "Video info request" used to take the ID part without "av" prefix. How does that work now with bvid? Or is there another API endpoint? I cannot access this "api" URL you provided. What does that return?

Adding a URL matcher might not be enough if the API requests need to be changed (different parameter, different API endpoint, etc.). As I said, I'm mostly interested in this "Video info request" API. Otherwise, I'll need another way to get the same information.


@AgFlore AgFlore commented Mar 25, 2020

http://api.bilibili.com/view?type=json&appkey=8e9fc618fbd41e28&id=6009789 (6009789 is aid) works as usual, and replacing it with bvid doesn't work.

https://api.bilibili.com/x/web-interface/archive/stat?bvid=1Qs411k7Qv now returns:
{"code":0,"message":"0","ttl":1,"data":{"aid":6009789,"bvid":"BV1Qs411k7Qv","view":1966272,"danmaku":57008,"reply":107610,"favorite":127873,"coin":97931,"share":16065,"like":43061,"now_rank":0,"his_rank":28,"no_reprint":0,"copyright":1,"argue_msg":"","evaluation":""}}

If our API doesn't work with bvid, the most straightforward approach is to obtain the aid from the bvid ourselves, either by calling the https://api.bilibili.com/x/web-interface/archive/stat?bvid= API or by doing the base58 calculation ourselves.
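As a rough illustration of the first option, the aid can be read straight out of the stat response. This is only a sketch: the helper name `aid_from_stat_response` is made up here, the sample body is an abbreviated copy of the response quoted above, and treating any non-zero `code` as an error is an assumption about the API's conventions.

```python
import json


def aid_from_stat_response(body: str) -> int:
    """Extract the numeric aid from a /x/web-interface/archive/stat response body."""
    payload = json.loads(body)
    # Assumption: "code": 0 signals success, as in the sample response.
    if payload.get("code") != 0:
        raise ValueError(f"unexpected response code: {payload.get('code')!r}")
    return int(payload["data"]["aid"])


# Abbreviated copy of the response shown above.
sample = (
    '{"code":0,"message":"0","ttl":1,'
    '"data":{"aid":6009789,"bvid":"BV1Qs411k7Qv","view":1966272}}'
)
print(aid_from_stat_response(sample))
```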


@riipah riipah commented Mar 25, 2020

I see. Do you know if they are planning on replacing that "Video info request"? And should we still use the "Player request" to get the duration (also called using aid)?

One way to do this is to save both aid and bvid, similar to what we've done with SoundCloud.

> or just do the base58 calculation ourselves.

And how do you get the "magic number" needed for the calculation?


@AgFlore AgFlore commented Mar 25, 2020

> I see. Do you know if they are planning on replacing that "Video info request"? And should we still use the "Player request" to get the duration (also called using aid)?

I'm not sure about that. I'll keep an ear out for news on this.

> One way to do this is to save both aid and bvid, similar to what we've done with SoundCloud.

Yeah, that's the safest way. (And if adding multiple parameters is OK, we can also settle #494.)

> > or just do the base58 calculation ourselves.
>
> And how do you get the "magic number" needed for the calculation?

That was worked out in https://www.zhihu.com/question/381784377/answer/1099438784 (I'm not sure how well Google Translate handles it). The risk is that the algorithm may change in the future.
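For reference, here is a sketch of the conversion as it circulated in community write-ups around the time of this discussion. Every constant below (the shuffled base58 alphabet, the character positions, the XOR key and the additive offset) is an unofficial, reverse-engineered value and should be treated as an assumption; the function names are made up for this example. It reproduces the aid/bvid pair quoted in this thread, but nothing guarantees it will keep working.

```python
# Unofficial constants from community write-ups (March 2020); all are assumptions.
TABLE = "fZodR9XQDSUm21yCkr6zBqiveYah8bt4xsWpHnJE7jL5VG3guMTKNPAwcF"  # shuffled base58 alphabet
INDEX = {ch: i for i, ch in enumerate(TABLE)}
POSITIONS = [11, 10, 3, 8, 4, 6]  # where each base58 digit sits, least significant first
XOR_KEY = 177451812
OFFSET = 8728348608
TEMPLATE = "BV1  4 1 7  "  # fixed characters; blanks are filled with digits


def bvid_to_aid(bvid: str) -> int:
    """Decode a 12-character bvid into the numeric aid."""
    value = sum(INDEX[bvid[pos]] * 58 ** i for i, pos in enumerate(POSITIONS))
    return (value - OFFSET) ^ XOR_KEY


def aid_to_bvid(aid: int) -> str:
    """Encode a numeric aid as a bvid string."""
    value = (aid ^ XOR_KEY) + OFFSET
    chars = list(TEMPLATE)
    for i, pos in enumerate(POSITIONS):
        chars[pos] = TABLE[value // 58 ** i % 58]
    return "".join(chars)


print(bvid_to_aid("BV1Qs411k7Qv"))  # 6009789, the aid from the example above
print(aid_to_bvid(6009789))         # BV1Qs411k7Qv
```

Because the scheme is undocumented, asking the API for the mapping (or storing both identifiers) avoids depending on these constants at all.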


@riipah riipah commented Mar 25, 2020

It's definitely possible to save multiple values, although of course it'd be easier to do it with a single value if possible. But on the other hand, like you said, it's unsafe to assume the algorithm will stay the same.


@AgFlore AgFlore commented Mar 25, 2020

> Does the "Video info request" now return bvid?
> Or is there another API endpoint?

I would guess yes. Instead of https://api.bilibili.com/view, try the new endpoint https://api.bilibili.com/x/web-interface/view?aid=6009789 , which returns:
{ "code": 0, "message": "0", "ttl": 1, "data": { "bvid": "BV1Qs411k7Qv", "aid": 6009789, "videos": 1, "tid": 30, "tname": "VOCALOID·UTAU", "copyright": 1, "pic": "http://i2.hdslb.com/bfs/archive/8b7dfa54e4cb68dc0815bb95cfcba9a3433a9119.jpg", "title": "【乐正绫原创】世末歌者【PV付/COSMOSⅡ】", "pubdate": 1472134541, "ctime": 1497423941, "desc": "自制\n【词曲:COP/绘:唯Tu/影:saiqomo/logo:少年莫然、实拍素材:莫雪 冰镇甜豆浆/压制:ZHider】\n世末歌者系列,部分系列曲:《世末积雨云》av1579079 ,《回音》av2711298 ,《hello&bye,days》av3402945\n具体走评论区。", "state": 0, "attribute": 32768, "duration": 363, "rights": { "bp": 0, "elec": 0, "download": 1, "movie": 0, "pay": 0, "hd5": 0, "no_reprint": 0, "autoplay": 1, "ugc_pay": 0, "is_cooperation": 0, "ugc_pay_preview": 0, "no_background": 0 }, "owner": { "mid": 396194, "name": "COPY", "face": "http://i2.hdslb.com/bfs/face/544541afa73735172abbe9e6a4bd44100899d48d.jpg" }, "stat": { "aid": 6009789, "view": 1966463, "danmaku": 57008, "reply": 107610, "favorite": 127881, "coin": 97943, "share": 16065, "now_rank": 0, "his_rank": 28, "like": 43069, "dislike": 0, "evaluation": "" }, "dynamic": "", "cid": 35302832, "dimension": { "width": 1280, "height": 568, "rotate": 0 }, "no_cache": false, "pages": [ { "cid": 35302832, "page": 1, "from": "vupload", "part": "", "duration": 363, "vid": "", "weblink": "", "dimension": { "width": 1280, "height": 568, "rotate": 0 } } ], "subtitle": { "allow_submit": false, "list": [] } } }

This endpoint can be accessed without any appkey, and I believe it's newer than the one we currently use, because features added to Bilibili in recent years (such as subtitle, like and pay) appear in its response but not in the old one's.

We can access this API with a bvid by simply requesting https://api.bilibili.com/x/web-interface/view?bvid=BV1Qs411k7Qv , which returns the same thing as above. So it seems that very little needs to be changed.

Since this is a completely open API, I'm not sure whether this endpoint has a request limit or any accessibility issues. Hopefully not.
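To make the "little needs to be changed" point concrete, here is a small sketch of building the request URL for either identifier and pulling out the fields we would likely need. The helper names `view_url` and `extract_video_info` are hypothetical, and the choice of fields is just a guess at what the importer needs; the field names themselves come from the response shown above.

```python
from urllib.parse import urlencode

VIEW_ENDPOINT = "https://api.bilibili.com/x/web-interface/view"


def view_url(video_id: str) -> str:
    """Build the view URL from "BV...", "av<digits>" or a bare numeric aid."""
    if video_id.startswith("BV"):
        query = {"bvid": video_id}
    else:
        aid = video_id[2:] if video_id.lower().startswith("av") else video_id
        query = {"aid": aid}
    return f"{VIEW_ENDPOINT}?{urlencode(query)}"


def extract_video_info(payload: dict) -> dict:
    """Pick a few fields out of a decoded view response (field names as in the sample)."""
    data = payload["data"]
    return {
        "aid": data["aid"],
        "bvid": data["bvid"],
        "title": data["title"],
        "duration": data["duration"],  # seconds, judging by the sample
        "thumbnail": data["pic"],
        "uploader": data["owner"]["name"],
        "pubdate": data["pubdate"],
    }
```

Keeping URL construction separate from parsing means the parsing can be tested against saved responses without hitting the network.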


@riipah riipah commented Mar 25, 2020

Ah I see. The API you linked before had /archive/ and returned much less information. I've been trying to keep a list of available Bilibili APIs, seems that they are changing a lot.

This web-interface API seems to have all the information needed, so we may not need the other APIs anymore. I'm wondering though, is pubdate the video publish date, in epoch (Unix timestamp)? Or ctime?

Anyway, it's not really a "small" change because we need to add parsing for a completely new endpoint.


@AgFlore AgFlore commented Mar 25, 2020

> I'm wondering though, is pubdate the video publish date, in epoch (Unix timestamp)? Or ctime?

I checked https://api.bilibili.com/x/web-interface/view?aid=78977256 (referring to https://vocadb.net/S/270309), which reports "pubdate": 1579877678, "ctime": 1576139043. Clearly pubdate is the timestamp at which the video was first made public (in this case Jan 24th, the 2020 Lunar New Year gala), while ctime is the timestamp at which the entry for this video was created (in this case Dec 12th 2019, perhaps when the song was finished and uploaded).
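The two values can be checked by decoding them as Unix timestamps. A short sketch follows; showing them in UTC+8 as well is my own assumption about what is useful here, since the API itself only returns the raw numbers.

```python
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # China Standard Time, used only for display


def to_datetime(ts: int, tz: timezone = timezone.utc) -> datetime:
    """Interpret a Unix timestamp (seconds) as an aware datetime in the given zone."""
    return datetime.fromtimestamp(ts, tz=tz)


pubdate = to_datetime(1579877678)
ctime = to_datetime(1576139043)
print(pubdate.isoformat())  # 2020-01-24T14:54:38+00:00
print(ctime.isoformat())    # 2019-12-12T08:24:03+00:00
print(to_datetime(1579877678, CST).isoformat())  # 2020-01-24T22:54:38+08:00
```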


@riipah riipah commented Mar 25, 2020

Ok thanks. This should be enough information. As usual, I can't give any guarantees on when I can do this, especially with this corona thing going on.
