Can't get all clips of a broadcaster #133

Closed
BarryCarlyon opened this issue Jun 8, 2020 · 11 comments
Labels
done The bug or issue has been addressed product: api API endpoints in the "helix" namespace product: docs Documentation on dev.twitch.tv/docs

Comments

@BarryCarlyon
Contributor

BarryCarlyon commented Jun 8, 2020

Brief description

Similar to #18, I get to page 11 and the code stops fetching clips.

How to reproduce

const fs = require('fs');
const path = require('path');

const got = require('got');

// Distinct clip creators seen so far, keyed by creator ID
const users = {};

let page = 0;
// Fetch one page of clips, save it to disk, then follow the cursor to the next page
const fetchPage = function(c) {
    got({
        url: 'https://api.twitch.tv/helix/clips',
        searchParams: {
            broadcaster_id: '23735582',
            after: c,
            first: 100
        },
        headers: {
            'client-id': 'b2z6u6b4sg6p84d3pq1ykcehlk1vhh9',
            authorization: 'Bearer AVALIDTOKE'
        },
        responseType: 'json'
    })
    .then(resp => {
        page++;

        // Clip objects carry creator_id (see the sample response below), not user_id
        for (const clip of resp.body.data) {
            users[clip.creator_id] = 1;
        }

        console.log('Page',page,'has',resp.body.data.length,'users',Object.keys(users).length);

        // Save the raw response as pages/<page>.json (the pages directory must already exist)
        fs.appendFileSync(path.join(
            __dirname,
            'pages',
            page + '.json'
        ), JSON.stringify(resp.body,null,4));
        // An empty pagination object means no cursor was returned, so paging ends here
        if (resp.body.pagination && resp.body.pagination.cursor) {
            fetchPage(resp.body.pagination.cursor);
        }
    })
    .catch(err => {
        console.log(err);
    });
};
fetchPage('');

Expected behavior

Get all clips for that broadcaster.

Screenshots

image

End of page 11:

        {
            "id": "SpotlessZealousSangPupper",
            "url": "https://clips.twitch.tv/SpotlessZealousSangPupper",
            "embed_url": "https://clips.twitch.tv/embed?clip=SpotlessZealousSangPupper",
            "broadcaster_id": "23735582",
            "broadcaster_name": "Sacriel",
            "creator_id": "65127431",
            "creator_name": "Jaheija",
            "video_id": "176890327",
            "game_id": "493057",
            "language": "en-gb",
            "title": "SacSingle & ShannonSingle PogChamp",
            "view_count": 290,
            "created_at": "2017-09-23T15:23:47Z",
            "thumbnail_url": "https://clips-media-assets2.twitch.tv/26321450576-offset-7934-preview-480x272.jpg"
        }
    ],
    "pagination": {}
}

Additional context or questions

@BarryCarlyon BarryCarlyon added the product: api API endpoints in the "helix" namespace label Jun 8, 2020
@BarryCarlyon
Contributor Author

Additional:

Running the same code against kraken also returns 11 pages, and then the code stops.

@itsmattthomas

I have reproduced this issue, PLS send help.

@willlllllio

For people trying to get all clips on channels with more than 1k clips: in the meantime, you can use the started_at and ended_at parameters and make multiple requests for smaller time ranges, such as one per month or per week over the past 4 years (Clips launched May 2016). If the results for a time range stop under 1k, you've probably got most of the clips in that range.

For better coverage, also make requests over even smaller time ranges, because time-ranged requests are broken and randomly leave out many clips (#48, #80).

Also, the docs for started_at and ended_at are wrong: the API truncates these timestamps to 10-minute granularity (#52), not to the second.
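
The time-range splitting described above could be sketched roughly like this. It is only an illustration: buildWindows is a made-up helper name, the 7-day step is an arbitrary choice, and I am assuming the ISO 8601 strings from toISOString() are accepted for started_at and ended_at.

```javascript
// Split [start, end) into consecutive windows of at most `days` days each.
// buildWindows is a hypothetical helper, not part of any Twitch library.
function buildWindows(start, end, days) {
    const stepMs = days * 24 * 60 * 60 * 1000;
    const windows = [];
    for (let from = start.getTime(); from < end.getTime(); from += stepMs) {
        // The last window is clamped so it never runs past `end`
        const to = Math.min(from + stepMs, end.getTime());
        windows.push({
            started_at: new Date(from).toISOString(),
            ended_at: new Date(to).toISOString()
        });
    }
    return windows;
}

// Example: weekly windows covering June 2020
const windows = buildWindows(
    new Date('2020-06-01T00:00:00Z'),
    new Date('2020-07-01T00:00:00Z'),
    7
);
console.log(windows.length, windows[0]);
```

Each window would then be passed as started_at / ended_at alongside broadcaster_id, with the cursor followed inside that window only.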

@JMTK

JMTK commented Jun 24, 2020

Yep, I was wondering why I was able to jump approximately 10 minutes in timestamps before the clip disappeared from the GET request. I guess I'll have to find a better way to get a more exact list of "new" clips. My original plan for getting any new clips that the client hadn't seen yet was to continuously use new Date().toISOString(), but I think I'll have to retain the last clip ID instead and stop when I reach it.
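
If the 10-minute truncation reported in this thread is what the API really does, one way to make it explicit on the client side is to round the timestamp down yourself before sending it. This is a sketch under that assumption only; floorToTenMinutes is a hypothetical helper and the exact rounding the API applies is not confirmed here.

```javascript
const TEN_MINUTES_MS = 10 * 60 * 1000;

// Round a Date down to the previous 10-minute boundary (UTC).
// Assumption: this mirrors the truncation described in #52.
function floorToTenMinutes(date) {
    return new Date(Math.floor(date.getTime() / TEN_MINUTES_MS) * TEN_MINUTES_MS);
}

console.log(floorToTenMinutes(new Date('2020-06-24T15:27:43Z')).toISOString());
// prints "2020-06-24T15:20:00.000Z"
```

Because a rounded-down started_at can return clips you have already seen, this pairs naturally with remembering clip IDs on the client.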

@Brandin

Brandin commented Jun 24, 2020

@JMTK My workaround was to pull the last 24 hours of clips, after paging through weeks and weeks' worth of data for the first import (you may need to reduce this depending on how many you are extracting this data for). I used the UNIQUE attribute on my SQL table column and appended ON DUPLICATE KEY UPDATE ClipId=ClipId to my query (replace that column and value according to your codebase).
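
For anyone not using a SQL table, the same "ignore duplicates" idea can be approximated in memory by keying on the clip id. A minimal sketch, assuming each clip object has the id field shown in the sample response; mergeClips is a hypothetical name.

```javascript
// Merge a batch of clips into a Map keyed by clip id, so a clip returned by
// overlapping requests is stored only once. Returns how many clips were new.
function mergeClips(store, batch) {
    let added = 0;
    for (const clip of batch) {
        if (!store.has(clip.id)) {
            store.set(clip.id, clip);
            added++;
        }
    }
    return added;
}

const store = new Map();
mergeClips(store, [{ id: 'a' }, { id: 'b' }]);
const added = mergeClips(store, [{ id: 'b' }, { id: 'c' }]);
console.log(added, store.size);
```

Here the second call only adds the clip with id 'c', since 'b' is already stored.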

@lleadbet
Contributor

lleadbet commented Sep 9, 2020

To close the loop on this, this was identified as a constraint of the service for Clips. As a result, you can only fetch 1000 per query (using the cursor) before it fails. This has been documented on the dev site here: https://dev.twitch.tv/docs/api/reference#get-clips.

@lleadbet lleadbet closed this as completed Sep 9, 2020
@lleadbet lleadbet added done The bug or issue has been addressed product: docs Documentation on dev.twitch.tv/docs labels Sep 9, 2020
@liamengland1

Maybe you should check that your documentation uses the appropriate punctuation

@lleadbet
Contributor

lleadbet commented Sep 9, 2020

Flagged to the team, thanks for the heads up @llacb47.

@lleadbet
Contributor

lleadbet commented Sep 9, 2020

Circling back, the documentation error will be fixed tomorrow. Thanks again for the flag!

@BarryCarlyon
Contributor Author

So in order to get all clips of a broadcaster, one would have to specify a time range and step backwards for each time range?

@lleadbet
Contributor

Correct, @BarryCarlyon.
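
One possible shape for the "step through time ranges" approach is to halve any range that hits the 1000-clip cap mentioned above. This is a sketch, not an official recipe: fetchWindow is a hypothetical caller-supplied function that pages through one time range and resolves to an array of clips, and the 10-minute minimum span is an assumption based on the truncation reported earlier in this thread.

```javascript
const CAP = 1000; // per-query limit described in this thread

// fetchWindow(startMs, endMs) is a hypothetical async function supplied by the
// caller; it should follow the cursor for one time range and return the clips.
async function collectClips(fetchWindow, startMs, endMs, minSpanMs = 10 * 60 * 1000) {
    const clips = await fetchWindow(startMs, endMs);
    const span = endMs - startMs;
    // Under the cap, or the range is too small to split further: accept the result
    if (clips.length < CAP || span <= minSpanMs) {
        return clips;
    }
    // At the cap, so the result was probably cut off: split the range in half
    // and collect the newer half first, stepping backwards in time
    const mid = startMs + Math.floor(span / 2);
    const newer = await collectClips(fetchWindow, mid, endMs, minSpanMs);
    const older = await collectClips(fetchWindow, startMs, mid, minSpanMs);
    return newer.concat(older);
}
```

A clip sitting exactly on a split boundary may come back from both halves, so the combined result should still be de-duplicated by clip id.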


7 participants