Extend fsck diagnostic output to help tracking down duplicate data point cause #477

Closed
IzakMarais opened this issue Apr 1, 2015 · 1 comment


@IzakMarais

After a discussion on the Google group, Manolama asked me to log an issue here.

I wrote:

It would be great if the tools for identifying the source of duplicate data were improved. I also have this problem.

Since I know the query that causes the problem, I know the metric name. However, I don't know the tags. Running fsck gives some output that might be useful:

2015-03-26 08:07:51,775 ERROR [main] Fsck: More than one column had a value for the same timestamp: timestamp: (1427357192000)
[24, -125]
[24, -117]

To which Manolama replied:

... yeah, if I'm not dumping the row key before or after that line, please open an issue and I'll get it in there. Sorry!

I can confirm: there is no row key before or after the line.
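
Even without the row key, the two qualifiers in the output above carry some information. The following is a minimal sketch, assuming OpenTSDB's documented second-precision qualifier layout (2 bytes: the upper 12 bits are the offset in seconds from the row's base hour, the lower 4 bits are flags, where bit 0x8 marks a float and the low 3 bits hold the value length minus one). The function name `decode_qualifier` is hypothetical and not part of OpenTSDB; it only interprets the signed Java bytes as fsck prints them.

```python
def decode_qualifier(signed_bytes):
    """Decode a 2-byte (second-precision) qualifier printed as signed Java bytes.

    Assumed layout: upper 12 bits = offset in seconds from the row's base
    hour, lower 4 bits = flags (0x8 = float, low 3 bits = value length - 1).
    """
    if len(signed_bytes) != 2:
        raise ValueError("expected a 2-byte (second-precision) qualifier")
    # Java prints bytes as signed values; mask to recover the unsigned byte.
    hi, lo = (b & 0xFF for b in signed_bytes)
    raw = (hi << 8) | lo
    flags = raw & 0x0F
    return {
        "offset_seconds": raw >> 4,
        "is_float": bool(flags & 0x08),
        "value_length": (flags & 0x07) + 1,
    }


for qualifier in ([24, -125], [24, -117]):
    print(qualifier, decode_qualifier(qualifier))
```

Under that assumed layout, both qualifiers decode to the same offset (392 seconds, which matches 1427357192 modulo 3600), one flagged as a 4-byte integer and the other as a 4-byte float. That would be consistent with the same series being written once as an integer and once as a float, but it still does not reveal the tags.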

@manolama
Member

Ah yeah, this is actually fixed in 2.1:

2015-04-19 13:35:58,238 INFO  [Fsck #0] Fsck: More than one column had a value for the same timestamp: (1356998400000 - Mon Dec 31 16:00:00 PST 2012)
    row key: (00000150E22700000001000001)
    write time: (1388534400000)  compacted: (false)  qualifier: [0, 0] <--- keep oldest
    write time: (1388534400001 - Tue Dec 31 16:00:00 PST 2013)  compacted: (false)  qualifier: [0, 11] value: 500.79998779296875
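
The row key in the 2.1 output is printed as hex and can be split into its parts to find the series. This is a rough sketch, assuming the default unsalted row key layout with 3-byte UIDs: metric UID, then a 4-byte base timestamp in seconds, then tagk/tagv UID pairs. The helper `decode_row_key` and its `uid_width` parameter are hypothetical names for illustration; a deployment with salting or non-default UID widths would need a different split.

```python
def decode_row_key(hex_key, uid_width=3):
    """Split a hex row key into metric UID, base time, and tag UID pairs.

    Assumes no salt prefix and the same UID width for metrics, tag keys,
    and tag values (3 bytes by default).
    """
    raw = bytes.fromhex(hex_key)
    metric_uid = raw[:uid_width]
    # The base timestamp is a 4-byte big-endian value in seconds.
    base_time = int.from_bytes(raw[uid_width:uid_width + 4], "big")
    tag_bytes = raw[uid_width + 4:]
    pair_width = 2 * uid_width
    if len(tag_bytes) % pair_width != 0:
        raise ValueError("tag section is not a whole number of tagk/tagv pairs")
    tags = [
        (tag_bytes[i:i + uid_width].hex(),
         tag_bytes[i + uid_width:i + pair_width].hex())
        for i in range(0, len(tag_bytes), pair_width)
    ]
    return {"metric_uid": metric_uid.hex(), "base_time": base_time, "tags": tags}


print(decode_row_key("00000150E22700000001000001"))
```

With these assumptions, the key above splits into metric UID `000001`, base time 1356998400 (the same instant as the reported timestamp 1356998400000 in milliseconds), and one tagk/tagv pair of `000001`/`000001`. The UIDs would still need to be resolved to names through the UID table.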
