New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seeing data loss when running continuous ingest with agitation #441

Closed
keith-turner opened this Issue Apr 25, 2018 · 1 comment

Comments

Projects
None yet
2 participants
@keith-turner
Contributor

keith-turner commented Apr 25, 2018

I did not get a chance to run continuous ingest with agitation before 1.9.0 was released. I ran it after the release and saw some data loss. I have tracked the bug down and it happens when a closed write ahead log is only referenced by minor compacting tablets. When this happens the tablet server may prematurely mark the write ahead log as unreferenced in zookeeper. The following is an example of this bug

  1. tserverA creates WAL1
  2. Data is written to tabletX on tserverA
  3. tserverA creates WAL2
  4. Data is written to tabletX on tserverA
  5. Minor compaction of tabletX starts
  6. tserver A marks WAL1 as unreferenced
  7. tserver A is killed
  8. the master adds only WAL2 to tabletX
  9. tabletX is loaded on tserverB and recovers data from WAL2

In the example above the tablet has data in WAL1, however since the tablet server marked it as unreferenced the master did not assign it to the tablet. The tablet server should only mark WAL1 as unreferenced after the minor compaction finishes, not after it starts.

This bug probably exists in 1.8.0 and later.

@keith-turner

This comment has been minimized.

Contributor

keith-turner commented Apr 25, 2018

Below are some notes I took while debugging this. There were ~600K missing rows out of 24 billion. I focused on one missing row 0481452f58dcde67 and using continuous ingest analysis tools I found :

  • The missing row was written between 18:56:46 and 18:57:05 on 4/23 to tablet 2;0500ec;0480ed.
  • Tablet 2;0500ec;0480ed was assigned to tserver worker3 at the time.

The following are selected logs messages from the tserver on worker 3.

# Begins using WAL 27cb640d-0865-4174-9226-804965ca39f3
2018-04-23 18:55:42,270 [log.TabletServerLogger] INFO : Using next log hdfs://leader1:8020/accumulo/wal/worker3+9997/27cb640d-0865-4174-9226-804965ca39f3

# last successful minor compaction of tablet before tserver is killed
2018-04-23 18:56:54,266 [tabletserver.LargestFirstMemoryManager] DEBUG: COMPACTING 2;0500ec;0480ed  total = 957,588,936 ingestMemory = 957,588,936
2018-04-23 18:56:55,822 [tablet.MinorCompactor] DEBUG: Begin minor compaction hdfs://leader1:8020/accumulo/tables/2/t-00005rd/F0000pzl.rf_tmp 2;0500ec;0480ed
2018-04-23 18:57:00,158 [tablet.Compactor] DEBUG: Compaction 2;0500ec;0480ed 158,183 read | 158,183 written | 44,811 entries/sec |  3.530 secs |    6,261,323 bytes | 1773745.892 byte/sec
2018-04-23 18:57:00,181 [tablet.Tablet] DEBUG: Logs for memory compacted: 2;0500ec;0480ed worker3:9997/hdfs://leader1:8020/accumulo/wal/worker3+9997/27cb640d-0865-4174-9226-804965ca39f3
2018-04-23 18:57:00,181 [tablet.Tablet] DEBUG: Logs for memory compacted: 2;0500ec;0480ed worker3:9997/hdfs://leader1:8020/accumulo/wal/worker3+9997/bf2e3788-e3ca-4ae3-bc1a-b89a3cdbe6b6
2018-04-23 18:57:00,181 [tablet.Tablet] DEBUG: Logs for current memory: 2;0500ec;0480ed worker3:9997/hdfs://leader1:8020/accumulo/wal/worker3+9997/27cb640d-0865-4174-9226-804965ca39f3
2018-04-23 18:57:00,181 [tablet.Tablet] DEBUG: Logs to be destroyed: 2;0500ec;0480ed worker3:9997/hdfs://leader1:8020/accumulo/wal/worker3+9997/bf2e3788-e3ca-4ae3-bc1a-b89a3cdbe6b6

# Begin using WAL ee2bd18d-9d37-4eee-89a9-b15f477063ac
2018-04-23 18:57:00,351 [log.TabletServerLogger] INFO : Using next log hdfs://leader1:8020/accumulo/wal/worker3+9997/ee2bd18d-9d37-4eee-89a9-b15f477063ac

# another minor compaction starts
2018-04-23 18:58:15,115 [tabletserver.LargestFirstMemoryManager] DEBUG: COMPACTING 2;0500ec;0480ed  total = 986,606,421 ingestMemory = 717,298,555

# wal is marked unreferenced
2018-04-23 18:58:17,587 [tserver.TabletServer] INFO : Marking hdfs://leader1:8020/accumulo/wal/worker3+9997/27cb640d-0865-4174-9226-804965ca39f3 as unreferenced

#tablet server is killed before compaction finishes

Later the tablet is loaded on worker 4 and even though the tablet has data in wal 27cb640d-0865-4174-9226-804965ca39f3, it is not used for recovery.

2018-04-23 19:01:13,957 [tablet.Tablet] INFO : Starting Write-Ahead Log recovery for 2;0500ec;0480ed
2018-04-23 19:01:13,960 [log.SortedLogRecovery] INFO : Looking at mutations from hdfs://leader1:8020/accumulo/recovery/25772b00-990b-4372-ae00-2db8584fe117 for 2;0500ec;0480ed
2018-04-23 19:01:14,007 [log.SortedLogRecovery] INFO : Ignoring log file hdfs://leader1:8020/accumulo/recovery/25772b00-990b-4372-ae00-2db8584fe117 appears to be unused by 2;0500ec;0480ed
2018-04-23 19:01:14,007 [log.SortedLogRecovery] INFO : Looking at mutations from hdfs://leader1:8020/accumulo/recovery/ee2bd18d-9d37-4eee-89a9-b15f477063ac for 2;0500ec;0480ed
2018-04-23 19:01:14,524 [log.SortedLogRecovery] INFO : Recovery complete for 2;0500ec;0480ed using hdfs://leader1:8020/accumulo/recovery/25772b00-990b-4372-ae00-2db8584fe117
2018-04-23 19:01:16,342 [log.SortedLogRecovery] INFO : Recovery complete for 2;0500ec;0480ed using hdfs://leader1:8020/accumulo/recovery/ee2bd18d-9d37-4eee-89a9-b15f477063ac
2018-04-23 19:01:16,342 [tablet.Tablet] INFO : Write-Ahead Log recovery complete for 2;0500ec;0480ed (147061 mutations applied, 147061 entries created)
2018-04-23 19:01:16,354 [tablet.Tablet] TABLET_HIST: 2;0500ec;0480ed opened
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment