Possible bug with commands being issued multiple times #38

Open
FewKinG opened this Issue Sep 18, 2012 · 2 comments

FewKinG commented Sep 18, 2012

I am using active_column (https://github.com/carbonfive/active_column) to manage migrations for my Cassandra database, and I recently ran into problems running the migrations against an empty database. The migrations failed at random points, claiming that the column family to be created already existed.

Using debug output, I confirmed that active_column and the cassandra gem issued the command (add_column_family) only once. After some digging I ended up analyzing the network traffic (see below) and found that the command was sent to Cassandra multiple times, with the first execution succeeding and the following ones failing. This resulted in the behaviour described above.

I was using the latest versions from source of the cassandra and active_column gems and version 0.8.2 of thrift_client. Downgrading to thrift_client 0.8.0 fixed the issue for me, so you may want to look into it.

If you have any further questions or findings feel free to let me know :)

This is an excerpt of the network traffic on port 9160 showing the exchange that led to the error.

11:00:24.116385 IP 127.0.0.1.41576 > 127.0.0.1.9160: Flags [P.], seq 114:346, ack 193, win 2048, options [nop,nop,TS val 56004934 ecr 56004907], length 232
E....]@.@..|.........h#.K7...S.D...........
...........<org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider..ng.......schema_migrations.......Standard.......LongType..    ...........A.j........?.........
11:00:24.229654 IP 127.0.0.1.41577 > 127.0.0.1.9160: Flags [P.], seq 55:287, ack 30, win 2049, options [nop,nop,TS val 56004963 ecr 56004962], length 232
E...zC@.@............i#.q......+...........
...........<org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider..ng.......schema_migrations.......Standard.......LongType..    ...........A.j........?.........
11:00:24.232542 IP 127.0.0.1.9160 > 127.0.0.1.41576: Flags [P.], seq 193:277, ack 347, win 2048, options [nop,nop,TS val 56004963 ecr 56004960], length 84
E.....@.@...........#..h.S.DK7.......|.....
.V.c.V.`...P........system_add_column_family..........$48569e50-016f-11e2-0000-242d50cf1fb5.
11:00:24.234937 IP 127.0.0.1.9160 > 127.0.0.1.41577: Flags [P.], seq 30:145, ack 287, win 2048, options [nop,nop,TS val 56004964 ecr 56004963], length 115
E...}q@.@...........#..i...+q..}...........
.V.d.V.c...o........system_add_column_family.............?schema_migrations already exists in keyspace SessionlineStaging..
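One way to spot the duplicated request in a capture like the one above is to group client-to-server packets by payload length. The following is a rough sketch only: the constant, regex, and method names are made up for illustration, the header lines are copied from the excerpt, and matching on length alone is a heuristic that would still need the payloads compared by eye.

```ruby
# Header lines copied from the capture above (payload lines omitted).
CAPTURE_HEADERS = <<~'DUMP'
  11:00:24.116385 IP 127.0.0.1.41576 > 127.0.0.1.9160: Flags [P.], seq 114:346, ack 193, win 2048, options [nop,nop,TS val 56004934 ecr 56004907], length 232
  11:00:24.229654 IP 127.0.0.1.41577 > 127.0.0.1.9160: Flags [P.], seq 55:287, ack 30, win 2049, options [nop,nop,TS val 56004963 ecr 56004962], length 232
  11:00:24.232542 IP 127.0.0.1.9160 > 127.0.0.1.41576: Flags [P.], seq 193:277, ack 347, win 2048, options [nop,nop,TS val 56004963 ecr 56004960], length 84
  11:00:24.234937 IP 127.0.0.1.9160 > 127.0.0.1.41577: Flags [P.], seq 30:145, ack 287, win 2048, options [nop,nop,TS val 56004964 ecr 56004963], length 115
DUMP

# Captures: timestamp, source port, destination port, payload length.
HEADER = /\A(\S+) IP \S+\.(\d+) > \S+\.(\d+): .*length (\d+)\z/

# Groups the source ports of packets sent to server_port by payload length.
def requests_by_length(dump, server_port)
  groups = Hash.new { |hash, key| hash[key] = [] }
  dump.each_line do |line|
    match = HEADER.match(line.strip)
    next unless match
    _time, src_port, dst_port, length = match.captures
    next unless dst_port.to_i == server_port
    groups[length.to_i] << src_port.to_i
  end
  groups
end

# Keep only lengths that were sent from more than one client port.
duplicates = requests_by_length(CAPTURE_HEADERS, 9160).select { |_len, ports| ports.size > 1 }
duplicates.each do |len, ports|
  puts "#{len}-byte request sent from client ports #{ports.join(', ')}"
end
```

For this excerpt the sketch reports the 232-byte request arriving from client ports 41576 and 41577, that is, over two separate connections.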

FewKinG referenced this issue in carbonfive/active_column on Sep 18, 2012: Keyspace/Column family already exists #16 (closed)

Contributor

ryanking commented Sep 18, 2012

Is it possible that this is due to retries? Do you still see it after turning off retries?

FewKinG commented Sep 18, 2012

That also came to mind. I tried to disable retries in the config file for cassandra/active_column (config/cassandra.yml): I tried 0 and 1 for the number of retries and even increased the retry period, but that didn't change anything. I should probably try using the thrift_client gem directly to be certain that the other gems aren't misconfiguring it.
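To make the retry hypothesis concrete, here is a small self-contained simulation of how a retry after a lost reply could produce an "already exists" error for a non-idempotent command. It is only a sketch: `FakeServer` and `RetryingClient` are hypothetical stand-ins, not the thrift_client or Cassandra APIs, and it does not establish that this is what happens in thrift_client 0.8.2.

```ruby
# Hypothetical stand-in for a Cassandra node; not the real server or Thrift API.
class FakeServer
  class AlreadyExists < StandardError; end

  attr_reader :column_families

  def initialize
    @column_families = []
  end

  # Non-idempotent: a second call with the same name fails.
  def add_column_family(name)
    raise AlreadyExists, "#{name} already exists in keyspace" if @column_families.include?(name)
    @column_families << name
    "schema-id-#{@column_families.size}"
  end
end

# Hypothetical client that re-sends a request when no reply arrives in time.
class RetryingClient
  class Timeout < StandardError; end

  attr_reader :requests_sent

  def initialize(server, retries:, lose_first_reply: false)
    @server = server
    @retries = retries
    @lose_first_reply = lose_first_reply
    @requests_sent = 0
  end

  def add_column_family(name)
    attempts = 0
    begin
      attempts += 1
      send_request(name)
    rescue Timeout
      retry if attempts <= @retries
      raise
    end
  end

  private

  def send_request(name)
    @requests_sent += 1
    reply = @server.add_column_family(name) # the server applies the change
    # Simulate the reply to the first request never reaching the client.
    raise Timeout, "no reply within the timeout" if @lose_first_reply && @requests_sent == 1
    reply
  end
end

server = FakeServer.new
client = RetryingClient.new(server, retries: 1, lose_first_reply: true)
error = nil
begin
  client.add_column_family("schema_migrations")
rescue FakeServer::AlreadyExists => e
  error = e
end
puts "requests sent: #{client.requests_sent}, error: #{error&.message}"
```

In this model the caller issues the command once, yet the server sees it twice: the first request succeeds and the retry fails. With `retries: 0` only one request is sent and the caller sees the timeout instead, so if duplicates still appear with retries set to 0, it may be worth checking whether that setting actually reaches thrift_client.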
