Do not parse the raw torrent file more than once in the code. #77

Open
jlouis opened this Issue Jan 21, 2011 · 6 comments

2 participants

@jlouis
Owner

We currently have many calls where we read the torrent file into its bcoded internal representation. This is mostly due to laziness. Only decode once, and then pass around the torrent file. One important part is to slowly change identification by FilePath of the torrent into identification by its InfoHash. This probably has to go first, before you can do the real work, namely parse-the-raw-file-only-once.

@edwardw

Looks like etorrent codes use two flavor of InfoHash. One is a binary():
crypto:sha(SomeThing).
All codes other than etorrent_dht submodule use this one.

The other is what etorrent_dht submodule uses, which is decoded from aforementioned binary() one:
integer_id(<BinaryInfoHash:160>) -> InfoHash.

I found such inconsistency confusing. Is there any particular reason to do that? If not, I will make them one.

@jlouis
Owner

No, there is no reason from my point of view. It is worth checking with klaar though, since he is the author of the DHT subsystem. I'd really like to hinge torrent identification on the InfoHash in the future rather than the file location we are using now.

@edwardw

Yes, exactly. I was trying to do that and found this interesting distinction.

@edwardw

Pinged klaar.

@edwardw

Got answer from klaar. Quote: "The DHT subsystem uses an integer as the type for both info hashes and node ids because the DHT protocol defines the distance between nodes, and between nodes and info hashes as the integer value that you get when you 'xor' the two id's. "

After second thought, I'm not going to change the code (make them one) since it works now. Instead, I'll write klaar's answer into comment for future reference.

@jlouis
Owner

+1 for that solution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment