Twisterd crashes with too many followers #24

Closed
gillhofer opened this issue Jan 1, 2014 · 14 comments
Comments

@gillhofer

System Ubuntu 12.04

I tried to test the capacity of the following system with the account @AllFollowing.
The twisterd application crashed a few times once that account was following 231 users. I don't yet know the reason for this.

The message on the console is:

terminate called after throwing an instance of 'std::runtime_error'
what(): CLevelDB(): error opening database environment IO error: /home/username/.twister/swarm/7b7528dedXXXXXXXXXXXXXXXXXXXXXXXXXb35c85394/CURRENT: Too many open files

The last entries of the log file are:

00:03:33.101: e34d85: AUTO MANAGER PAUSING TORRENT
00:03:33.101: 24b0cd: AUTO MANAGER PAUSING TORRENT
00:03:33.101: a47e47: AUTO MANAGER PAUSING TORRENT
00:03:33.101: 569e97: AUTO MANAGER PAUSING TORRENT
00:03:33.101: 26ab2c: AUTO MANAGER PAUSING TORRENT
00:03:33.101: c7c210: AUTO MANAGER PAUSING TORRENT
00:03:33.101: 342784: AUTO MANAGER PAUSING TORRENT
00:03:33.101: e9c7d1: AUTO MANAGER PAUSING TORRENT
00:03:33.101: aeb7bc: AUTO MANAGER PAUSING TORRENT
00:03:33.101: b7503b: AUTO MANAGER PAUSING TORRENT
00:03:33.102: f23eed: AUTO MANAGER PAUSING TORRENT
00:03:33.102: a985c6: AUTO MANAGER PAUSING TORRENT
00:03:33.102: c230ce: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 16344d: AUTO MANAGER PAUSING TORRENT
00:03:33.102: daa077: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 50b1ec: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 676d58: AUTO MANAGER PAUSING TORRENT
00:03:33.102: cb87c1: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 9b7cc0: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 01250e: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 5d0d23: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 5d0d23: PAUSING TORRENT
00:03:33.102: 43688f: AUTO MANAGER PAUSING TORRENT
00:03:33.102: 43688f: PAUSING TORRENT
00:03:33.102: 43688f: fastresume data rejected: ret: -1 (0) Success
00:03:33.102: 43688f: set_state() 0
00:03:33.102: fbc78d: creating torrent: walt_disney
00:03:33.102: fbc78d: starting torrent
Opening LevelDB in /home/username/.twister/swarm/fbc78d45c6b983f33d4cd0aa0052beb96e99f25b

@miguelfreitas
Owner

Wild guess: could you try increasing this?

src/init.cpp
#define MIN_CORE_FILEDESCRIPTORS 150

@gillhofer
Author

Yes, seems like your guess was right.

I raised this constant from 150 to 2000.

With this value I managed to follow 251 users. Now I'm trying to figure out how to automatically follow all users in the block chain. I think I will write a script for this task.

I should also mention that
MIN_CORE_FILEDESCRIPTORS = 1000
was not sufficient: twisterd still crashed at ~240 followed users.
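
For background: on POSIX systems the number of files a process may hold open is capped by the RLIMIT_NOFILE resource limit, and a daemon can ask for a higher soft limit at startup. Below is a minimal sketch of that mechanism, assuming a POSIX system. raiseFdLimit is a hypothetical helper written for this illustration, not twisterd's actual code, and how MIN_CORE_FILEDESCRIPTORS feeds into the real limit is not shown in this thread.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_NOFILE (POSIX)

// Hypothetical helper (not twisterd's code): try to make sure the process
// may open at least `wanted` file descriptors. `wanted` must be positive.
// Returns the soft limit in effect afterwards, or -1 if it cannot be read.
long raiseFdLimit(long wanted) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
        return -1;
    }

    const rlim_t target = static_cast<rlim_t>(wanted);
    if (lim.rlim_cur < target) {
        // Without extra privileges the soft limit can only be raised
        // up to the hard limit, so clamp the request.
        lim.rlim_cur = (target < lim.rlim_max) ? target : lim.rlim_max;
        setrlimit(RLIMIT_NOFILE, &lim);  // best effort; re-read below
        if (getrlimit(RLIMIT_NOFILE, &lim) != 0) {
            return -1;
        }
    }

    // An unlimited soft limit trivially satisfies the request.
    if (lim.rlim_cur == RLIM_INFINITY) {
        return wanted;
    }
    return static_cast<long>(lim.rlim_cur);
}
```

The return value should be checked rather than assumed, because the hard limit may be lower than the requested value.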

@miguelfreitas
Owner

By the way, how did you manage to follow 251 users? Did you add them one by one?

@gillhofer
Author

Basically, yes.

I collected as many users as I could from the timeline, and also typed "a", "b", etc. into the search box and manually hit the follow buttons...

Ha, wouldn't do it again :D

@cwarden

cwarden commented Jan 2, 2014

Shouldn't this work to follow a lot of users?
./twisterd listusernamespartial a 251 | paste -s | xargs -0 -i ./twisterd follow <user> '{}'
It doesn't seem to work, but I'm not sure why.

@gillhofer
Author

There is also a bug in the GUI on accounts with lots of followers. It occurs in Firefox, Opera and Chrome (I didn't test Safari).

[Image: untitled screenshot]

Possible solutions:

    1. Don't show all users at once; use a "show more" button.
    2. Don't show all users at once; put the names in a fixed-size box with a scroll bar.
    3. Don't show all users at once; use a paging system with 20(?) names per page. Show page one by default, and let the user jump to other pages of followers.
    4. Don't use just one column. There is currently space for another column of followers.
    5. Expand the white background as far as needed.

I would prefer solution 5 in combination with 4.

miguelfreitas added a commit that referenced this issue Jan 2, 2014
the original value was ok for bitcoin only but now we also need fd's for libtorrent.
@gillhofer
Author

./twisterd listusernamespartial c 10000 | xargs -0 ./twisterd follow allfollowing 

echo $? shows "0" afterwards, so the command succeeded.

But as far as I can see, I don't follow the users afterwards. At least the GUI doesn't show them.
I don't yet know exactly how the twister internals work, but for each user in this command a new torrent is created (the corresponding message appears in the log):

ThreadRPCServer method=follow
adding torrent for [attorney,tracker]
00:05:04.262: 01c7a9: creating torrent: attorney
00:05:04.262: 01c7a9: starting torrent
Opening LevelDB in /home/user/.twister/swarm/01c7a982f8f4b48ff81c8fe611a9bb1d43634daf
Opened LevelDB successfully
00:05:04.359: baa21d: AUTO MANAGER PAUSING TORRENT
00:05:04.359: dc9a53: AUTO MANAGER PAUSING TORRENT
.......
00:05:04.363: 01c7a9: AUTO MANAGER PAUSING TORRENT
00:05:04.363: 01c7a9: PAUSING TORRENT
00:05:04.363: 01c7a9: fastresume data rejected: ret: -1 (0) Success
00:05:04.363: 01c7a9: set_state() 0

Has anyone figured out how the follow command works? Why is "myname" mentioned twice?

./twisterd follow myname '["myname","myfriend"]'
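
Judging only from the invocation above, the second argument looks like a JSON array of usernames. Below is a minimal sketch of building such an argument from a list of names, under that assumption. toJsonArray is a hypothetical helper, not part of twisterd, and it escapes only quotes and backslashes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: format usernames as a JSON array of strings,
// e.g. ["myname","myfriend"]. Only quotes and backslashes are escaped;
// names are assumed to contain no control characters.
std::string toJsonArray(const std::vector<std::string>& names) {
    std::string out = "[";
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (i > 0) {
            out += ",";
        }
        out += '"';
        for (char c : names[i]) {
            if (c == '"' || c == '\\') {
                out += '\\';  // escape the next character
            }
            out += c;
        }
        out += '"';
    }
    out += "]";
    return out;
}
```

The resulting string would be passed as a single shell argument, which is why the example above wraps it in single quotes.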

@miguelfreitas
Owner

gillhofer, I have just increased the limit in the repository. It was a great exercise on your side, thanks!

I wouldn't say that just increasing the limit is really a "fix". No matter how high a number we pick, there will always be some number of followed users that exceeds the new limit. This is why I didn't close the issue.

Perhaps we would need deeper changes in the codebase, like doing some sort of round-robin over the users we follow so that their post databases are not kept open all the time. However, this may be such an unrealistic workload at this point that it is not worth the trouble. I don't know.
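
One way to avoid keeping every post database open is to cap the number of open handles and close the least recently used one when the cap is reached. Below is a minimal sketch of that idea. It uses least-recently-used eviction rather than a strict round-robin, and SwarmDb and OpenDbCache are hypothetical names for this illustration, not types from twister. A real version would own a LevelDB handle inside SwarmDb and close it on destruction.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Stand-in for an open per-user post database (hypothetical type).
struct SwarmDb {
    std::string username;
};

// Keeps at most `capacity` databases open (capacity >= 1 assumed);
// opening one more closes the least recently used one.
class OpenDbCache {
public:
    explicit OpenDbCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns the database for `username`, opening it if needed.
    // The reference is only valid until the next call to get(),
    // because that call may evict this entry.
    SwarmDb& get(const std::string& username) {
        auto it = index_.find(username);
        if (it != index_.end()) {
            // Already open: move it to the front (most recently used).
            open_.splice(open_.begin(), open_, it->second);
            return open_.front();
        }
        if (open_.size() >= capacity_ && !open_.empty()) {
            // Evict the least recently used entry (back of the list).
            index_.erase(open_.back().username);
            open_.pop_back();
        }
        open_.push_front(SwarmDb{username});
        index_[username] = open_.begin();
        return open_.front();
    }

    bool isOpen(const std::string& username) const {
        return index_.count(username) != 0;
    }

    std::size_t openCount() const { return open_.size(); }

private:
    std::size_t capacity_;
    std::list<SwarmDb> open_;  // front = most recently used
    std::unordered_map<std::string, std::list<SwarmDb>::iterator> index_;
};
```

With a cap of 2, touching alice, bob, alice and then carol leaves alice and carol open and closes bob.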

@cwarden

cwarden commented Jan 2, 2014

./twisterd getfollowing myname shows the newly followed users, but they don't show up in the web interface.

@miguelfreitas
Owner

gillhofer, the "follow" command of the daemon does just that: it starts the torrents for these users so that their posts are downloaded locally and direct messages are processed.

There is, however, another command, "getposts", that requires the list of users as a parameter. This list is kept by the JS code in the UI, which is why users followed directly through the daemon do not show up in the web interface.

PS: yes, the UI could be changed to use "getfollowing" to obtain the daemon's current list and merge the two.

PS2: somebody should be documenting all that in our wiki pages ;-)

@miguelfreitas
Owner

I have been thinking about this. It is not difficult to fix that limit, not at all.

The way I modeled torrent swarms was to remove all direct access to files (within a given torrent) and replace it with a leveldb instance. I don't know much about leveldb, but I'd guess it keeps a few file descriptors open per torrent, maybe 3 or so (I don't know the exact number).

It is not difficult to replace the individual per-torrent leveldb instances with a single global leveldb, that is, just one database for all swarms. We just need to include the hash or username when computing the key in leveldb.
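
The single-database idea can be sketched with composite keys. The example below is an illustration under stated assumptions, not the committed change. The key layout (username, a NUL separator, then the per-swarm key) is hypothetical and assumes usernames never contain a NUL byte, and std::map stands in for LevelDB because both keep keys in sorted byte order.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical key layout: "<username>\0<per-swarm key>".
// The NUL separator keeps "alice" from matching keys of "alicia".
std::string makeKey(const std::string& username, const std::string& key) {
    std::string out = username;
    out.push_back('\0');
    out += key;
    return out;
}

// Stand-in for one global database shared by all swarms.
using GlobalDb = std::map<std::string, std::string>;

// Collect every value stored for `username` with a prefix scan:
// all of a user's entries are contiguous in sorted key order.
std::vector<std::string> valuesFor(const GlobalDb& db,
                                   const std::string& username) {
    const std::string prefix = makeKey(username, "");
    std::vector<std::string> out;
    for (auto it = db.lower_bound(prefix);
         it != db.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        out.push_back(it->second);
    }
    return out;
}
```

With this layout the number of open database files no longer grows with the number of followed users, since every swarm shares the same store.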

libtorrent already seems to do a good job of round-robining the torrents, so file descriptors used for connections shouldn't be an architectural limitation.

@gubatron
Contributor

gubatron commented Jan 6, 2014

@miguelfreitas if you're using leveldb, you should take a look at fastdb, which I believe was implemented by Amir Taaki or someone at DarkWallet, whom I think you've met.

It's basically an mmapped hashtable that is up to 3x faster for writes and up to 6x faster for reads than leveldb.

Code for hashtable_database
https://github.com/genjix/fastdb/blob/master/src/hashtable_database.hpp
https://github.com/genjix/fastdb/blob/master/src/hashtable_database.cpp

and code for the memory mapped file wrapper
https://github.com/genjix/fastdb/blob/master/src/mmfile.hpp
https://github.com/genjix/fastdb/blob/master/src/mmfile.cpp

[Chart: write speeds in milliseconds (lower is better)]

[Chart: read speeds in milliseconds]

and here's the code for the test, leveldb vs mmfile (so you see these aren't bullshit)

Write test with LevelDB https://github.com/genjix/fastdb/blob/master/src/write-lvl.cpp
Write test with FastDB https://github.com/genjix/fastdb/blob/master/src/write-fdb.cpp
Read test with LevelDB https://github.com/genjix/fastdb/blob/master/src/fetch-lvl.cpp
Read test with FastDB https://github.com/genjix/fastdb/blob/master/src/fetch-fdb.cpp

I think they're worth a try, and on top of that you can keep a single instance of fastdb (mmht), hopefully passed as a parameter every time ;) (globals are bad; macros at the end of method declarations are cool for such things, I've seen)

@gubatron
Contributor

gubatron commented Jan 6, 2014

Here's more on their findings, I think they'll be using this for their blockchain db
https://wiki.unsystem.net/index.php/Libbitcoin/Blockchain#High_performance_and_low-overhead_database

@miguelfreitas
Owner

@gubatron this FastDB seems very nice indeed! But still, I have just committed the change discussed above with LevelDB. You know, I'm a supporter of Knuth's "premature optimization is the root of all evil" motto. First we make it work correctly, then later we optimize where required.
