
Daemon.jsonrpc_file_read: read claims from a file and download them #3423

Closed
belikor wants to merge 2 commits into lbryio:master from belikor:print-summary-read

Conversation

@belikor (Contributor) commented Sep 14, 2021

This follows after #3422.

The idea with #3422 is to produce a file with a list of claims. With this pull request we take that file, parse it to get the claim IDs, and then download each of the streams. The file follows the comma-separated values (CSV) format, although by default we use the semicolon `;` as the separator.

lbrynet file summary --file=summary.txt
lbrynet file read --file=summary.txt

Basically, the idea is that we can share lists of claims with other users of the LBRY network, and they can import these lists on their own computers (through lbrynet or the LBRY Desktop application) to download the same claims that we have, and thus help seed the same content that we are seeding.

This is a prototype implementation; it works when the number of claims is relatively small; however, once the number of claims is large, more than 500 or so, the Daemon.jsonrpc_file_read method will time out, so it won't finish processing the list. I'm not sure what can be done to make sure it processes a big list without timeouts.

The obvious solution is to not implement this in the SDK itself, but parse the file, and call lbrynet get on each of the claims.

```
# Parse the summary file and call `lbrynet get` on each claim ID
# (relies on `get` accepting claim IDs, as proposed in #3411).
import csv
import subprocess

with open("summary.txt", newline="") as f:
    for row in csv.reader(f, delimiter=";"):
        claim_id = row[2].strip()  # the third field holds the claim_id
        subprocess.run(["lbrynet", "get", claim_id])
```

Then each call to get is separate from the others, and each has its own timeout.

Also, since the file is meant to contain the 'claim_id', get should be able to handle claim IDs, as proposed in #3411.

This allows printing a list of all claim streams that were downloaded
to the system.
The list is printed to the terminal or to a specific file.

It accepts some parameters to control the information that is printed.
```
lbrynet file summary --blobs --show_channel --title --stream_type --path
lbrynet file summary --show=incomplete --start=10 --end=40
lbrynet file summary --sort=claim_name --reverse --sep=' ;.;'
```

The `--file` option writes the list of claims to a file
which then can be shared with other users of LBRY in order
to download the same claims and contribute to seeding that content.
```
lbrynet file summary --channel=@somechannel --file=summary.txt --fdate
```

By default it will print the date of the claim (derived from the `'claim_height'`, that is, when the claim was registered in the blockchain), the `'claim_id'`, the `'claim_name'`, and whether the media file is present in the download directory.
```
 1/42; 20200610_10:23:37-0500; b231714456ee832daeba4b8356803e7591126dff; "07-S"; no-media
 2/42; 20200610_10:27:06-0500; 31700ff11f900429d742f2f137ba25393bdb3b0a; "09-S"; media
 3/42; 20200609_23:14:47-0500; 70dfefa510ca6eee7023a2a927e34d385b5a18bd; "04-S"; no-media
```
@belikor force-pushed the print-summary-read branch 2 times, most recently from bb4a666 to 147bf42 on September 14, 2021 at 22:50
With `lbrynet file summary` we are able to produce a file with a list
of claims.

With `lbrynet file read` we are able to parse that file,
get the claim IDs, and then download each of the streams.
```
lbrynet file read --file=summary.txt
```
@coveralls

Coverage Status

Coverage decreased (-0.5%) to 67.453% when pulling d9acdb8 on belikor:print-summary-read into 561566e on lbryio:master.

@eukreign assigned lyoshenka and unassigned eukreign Sep 15, 2021
@eukreign (Member)

@lyoshenka this PR involves API changes, please review

@belikor (Contributor, Author) commented Sep 20, 2021

> it works when the number of claims is relatively small; however, once the number of claims is large, more than 500 or so, the Daemon.jsonrpc_file_read method will time out,

Is there a way to increase the timeout? I wonder if I can just pass the --timeout option all the way down to the jsonrpc_get method. The idea is that if we pass a file with an arbitrary number of claims, say 5000, the method will process every single item.
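
For illustration, a rough sketch of what that could look like (the parameter and helper names below are placeholders, not the actual implementation in this branch):

```
# Hypothetical sketch: thread a per-claim timeout down to each get call,
# so a long list never hits a single overall deadline.
# `parse_summary_file` is a placeholder helper, and passing a claim ID
# to `get` relies on the behavior proposed in #3411.
async def jsonrpc_file_read(self, file_path=None, timeout=None):
    results = []
    for claim_id in parse_summary_file(file_path):
        results.append(await self.jsonrpc_get(claim_id, timeout=timeout))
    return results
```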

@lyoshenka (Member)

I'd prefer not to add this feature. It can be accomplished with a few lines of scripting, and as you pointed out it doesn't work when there are many claims (at which point you fall back to scripting anyway).

As I said in #3422 (comment), we should aim to keep the API simple.
