Giles should use Options parser instead of positional arguments #4

Closed
SeanTAllen opened this issue Jan 30, 2016 · 1 comment

Comments

@SeanTAllen
Contributor

No description provided.

@SeanTAllen
Contributor Author

Given the state of the options package for Pony (which we have discussed replacing), I'm closing this. The code is better with positional args.

slfritchie added a commit to slfritchie/wallaroo that referenced this issue Jul 31, 2018
This patch attempts to work around this GC problem ... but AFAICT it
doesn't help.

(lldb) bt all
* thread #1: tid = 0, 0x000000000072a680 market-spread`ponyint_heap_mark, name = 'market-spread', stop reason = signal SIGSEGV
  * frame #0: 0x000000000072a680 market-spread`ponyint_heap_mark
    frame #1: 0x000000000072d007 market-spread`ponyint_gc_markimmutable + 103
    frame #2: 0x000000000072c078 market-spread`ponyint_mark_done + 24
    frame #3: 0x000000000073055c market-spread`ponyint_actor_run + 572
    frame #4: 0x0000000000724862 market-spread`run_thread + 242
    frame #5: 0x00007fda382b16ba libpthread.so.0`start_thread + 202
    frame #6: 0x00007fda378c441d libc.so.6`clone + 109
slfritchie added a commit to slfritchie/wallaroo that referenced this issue Jul 31, 2018
…ls are corrupting memory

Compile without any verbose debugging spam:

    /usr/bin/env PATH=/usr/local/pony/0.24.0/bin:$PATH make -j2 PONYCFLAGS="--verbose=1 -d" resilience=on build-testing-performance-apps-market-spread build-giles-all

(lldb) bt all
* thread #1: tid = 0, 0x000000000072f0d3 market-spread`ponyint_objectmap_getorput + 131, name = 'market-spread', stop reason = signal SIGSEGV
  * frame #0: 0x000000000072f0d3 market-spread`ponyint_objectmap_getorput + 131
    frame #1: 0x000000000072bd2d market-spread`ponyint_gc_recvobject + 125
    frame #2: 0x000000000040ec55 market-spread`recovery_RemoteJournalClient_Dispatch + 165
    frame #3: 0x000000000072fd53 market-spread`ponyint_actor_run + 451
    frame #4: 0x00000000007240d2 market-spread`run_thread + 242
    frame #5: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #6: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #2: tid = 1, 0x00007fb33f4c7a13 libc.so.6`epoll_wait + 51, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33f4c7a13 libc.so.6`epoll_wait + 51
    frame #1: 0x0000000000722349 market-spread`ponyint_asio_backend_dispatch + 169
    frame #2: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #3: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #3: tid = 2, 0x00007fb33febe156 libpthread.so.0`__libc_sigwait + 230, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febe156 libpthread.so.0`__libc_sigwait + 230
    frame #1: 0x00007fb3361b2b58

  thread #4: tid = 3, 0x00007fb33feb598d libpthread.so.0`pthread_join + 189, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33feb598d libpthread.so.0`pthread_join + 189
    frame #1: 0x00000000007291ab market-spread`ponyint_thread_join + 11
    frame #2: 0x000000000072316a market-spread`ponyint_sched_shutdown + 58
    frame #3: 0x00000000007234bc market-spread`ponyint_sched_start + 156
    frame #4: 0x0000000000724db8 market-spread`pony_start + 88
    frame #5: 0x0000000000721f64 market-spread`main + 308
    frame #6: 0x00007fb33f3e0830 libc.so.6`__libc_start_main + 240
    frame #7: 0x0000000000402859 market-spread`_start + 41

  thread #5: tid = 4, 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45
    frame #1: 0x00000000007230b3 market-spread`ponyint_cpu_core_pause + 99
    frame #2: 0x0000000000724454 market-spread`run_thread + 1140
    frame #3: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #4: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #6: tid = 5, 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45
    frame #1: 0x00000000007230b3 market-spread`ponyint_cpu_core_pause + 99
    frame #2: 0x0000000000724454 market-spread`run_thread + 1140
    frame #3: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #4: 0x00007fb33f4c741d libc.so.6`clone + 109
slfritchie added a commit to slfritchie/wallaroo that referenced this issue Jul 31, 2018
…ls are corrupting memory

Compile without any verbose debugging spam:

    /usr/bin/env PATH=/usr/local/pony/0.24.0/bin:$PATH make -j2 PONYCFLAGS="--verbose=1 -d" resilience=on build-testing-performance-apps-market-spread build-giles-all

... However, it doesn't help.  Here's a set of stacktraces with this patch applied.

(lldb) bt all
* thread #1: tid = 0, 0x000000000072f0d3 market-spread`ponyint_objectmap_getorput + 131, name = 'market-spread', stop reason = signal SIGSEGV
  * frame #0: 0x000000000072f0d3 market-spread`ponyint_objectmap_getorput + 131
    frame #1: 0x000000000072bd2d market-spread`ponyint_gc_recvobject + 125
    frame #2: 0x000000000040ec55 market-spread`recovery_RemoteJournalClient_Dispatch + 165
    frame #3: 0x000000000072fd53 market-spread`ponyint_actor_run + 451
    frame #4: 0x00000000007240d2 market-spread`run_thread + 242
    frame #5: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #6: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #2: tid = 1, 0x00007fb33f4c7a13 libc.so.6`epoll_wait + 51, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33f4c7a13 libc.so.6`epoll_wait + 51
    frame #1: 0x0000000000722349 market-spread`ponyint_asio_backend_dispatch + 169
    frame #2: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #3: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #3: tid = 2, 0x00007fb33febe156 libpthread.so.0`__libc_sigwait + 230, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febe156 libpthread.so.0`__libc_sigwait + 230
    frame #1: 0x00007fb3361b2b58

  thread #4: tid = 3, 0x00007fb33feb598d libpthread.so.0`pthread_join + 189, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33feb598d libpthread.so.0`pthread_join + 189
    frame #1: 0x00000000007291ab market-spread`ponyint_thread_join + 11
    frame #2: 0x000000000072316a market-spread`ponyint_sched_shutdown + 58
    frame #3: 0x00000000007234bc market-spread`ponyint_sched_start + 156
    frame #4: 0x0000000000724db8 market-spread`pony_start + 88
    frame #5: 0x0000000000721f64 market-spread`main + 308
    frame #6: 0x00007fb33f3e0830 libc.so.6`__libc_start_main + 240
    frame #7: 0x0000000000402859 market-spread`_start + 41

  thread #5: tid = 4, 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45
    frame #1: 0x00000000007230b3 market-spread`ponyint_cpu_core_pause + 99
    frame #2: 0x0000000000724454 market-spread`run_thread + 1140
    frame #3: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #4: 0x00007fb33f4c741d libc.so.6`clone + 109

  thread #6: tid = 5, 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45, stop reason = signal SIGSEGV
    frame #0: 0x00007fb33febdc1d libpthread.so.0`__GI___nanosleep + 45
    frame #1: 0x00000000007230b3 market-spread`ponyint_cpu_core_pause + 99
    frame #2: 0x0000000000724454 market-spread`run_thread + 1140
    frame #3: 0x00007fb33feb46ba libpthread.so.0`start_thread + 202
    frame #4: 0x00007fb33f4c741d libc.so.6`clone + 109
slfritchie added a commit to slfritchie/wallaroo that referenced this issue Aug 7, 2018
Squash of slf-file-io-journal3-take2 branch at commit fd32554 2018-07-10 + merge fixes

WIP: all I/O is now journalled, yay!

Also:
WIP: serialize writev() though with some extra (?) copying
Add dump-journal.py

WIP: thread through AmbientAuth down to journal guts

WIP: switch to journal-only writes

WIP: log rotation goop

WIP: add --resilience-no-local-file-io flag + thread it through everywhere

WIP: set _the_journal in multi-worker case, oops

Fix recovery startup bugs, found by 'make integration-tests-testing-correctness-tests-recovery'

WIP: add do_local_file_io/--resilience-no-local-file-io flag control over EventLog files

WIP: add async notification trait SimpleJournalAsyncResponseReceiver

WIP: no spam on async_io_ok()

Dumb skeleton of utils/dos-dumb-object-service/dos-server.py

dos-server.py strawman is complete

WIP: racy DOSclient actor

Hrrrrm, I thought that the TCP code will buffer data
that is sent before connection.  But that doesn't
appear to be the case.  (Perhaps it's the Wallaroo
forks that do the buffering but stdlib does not?)
AAAAhhhhhh, correct, stdlib's write() and writev()
both start with `if _connected and...`

When the @sleep is uncommented to avoid the race:

    SOCK: I am connected.
    DOSc: send_ls
    SOCK: sent
    SOCK: received header
    SOCK: received payload
    DOSclient GOT:bar	no
    baz	no
    foo	no

Without the sleep:

    DOSc: send_ls
    SOCK: I am connected.
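
One way around the race, without the sleep, is to queue requests that arrive before the connection is up and flush them once it is. The following is only a sketch in Python, not the Pony client: BufferingClient, its send callable and the connected() hook are hypothetical names made up for illustration.

```python
from collections import deque


class BufferingClient:
    """Hypothetical wrapper: holds writes issued before the connection is up."""

    def __init__(self, send):
        self._send = send          # callable that actually transmits bytes
        self._connected = False
        self._pending = deque()    # writes issued while not yet connected

    def write(self, data: bytes) -> None:
        if self._connected:
            self._send(data)
        else:
            # Rather than skipping the write, keep it until connected() runs.
            self._pending.append(data)

    def connected(self) -> None:
        # Called once the connection is established: flush in arrival order.
        self._connected = True
        while self._pending:
            self._send(self._pending.popleft())


sent = []
client = BufferingClient(sent.append)
client.write(b"ls")        # issued before the connection is established
client.connected()         # the queued request is flushed here
client.write(b"get bar")   # sent immediately
```

With this shape the `ls` request is delivered after the connect event instead of being lost, whichever of the two happens first.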

WIP: trying a Promise, hold my beer

WIP: really, change of Any -> 'type DOSreply' works, really?

WIP: minimal promise hack, add file size to 'ls'

WIP: parse ls response

WIP: fully parsed results for ls, yay!

Add file offset & desired length to get

WIP: Add do_get_chunk() to DOSclient

WIP: get_file() to fetch entire file works, but AFAIK not appropriate for any error case

WIP: entire file transfer with Promise at end of success

However, I think that something isn't working correctly
in the failure case.  Needs more checking

Output:

SOCK: received header
SOCK: received payload
PROMISE: 0x12a4f5f00: Yay, p1 chunk at offset 0
PROMISE: 0x45330: I got a chunk of size 10
>>>This is th<<<
PROMISE: 0x12a4f5ec0: Yay, p0 chunk at offset 0
SOCK: received header
SOCK: received payload
PROMISE: 0x12a4f5f80: Yay, p1 chunk at offset 10
PROMISE: 0x45330: I got a chunk of size 10
>>>e bar, yes<<<
PROMISE: 0x12a4f5f40: Yay, p0 chunk at offset 10
SOCK: received header
SOCK: received payload
PROMISE: 0x1324e7800: Yay, p1 chunk at offset 20
PROMISE: 0x45330: I got a chunk of size 10
>>>, it
defin<<<
PROMISE: 0x12a4f5fc0: Yay, p0 chunk at offset 20
SOCK: received header
SOCK: received payload
PROMISE: 0x1324e7880: Yay, p1 chunk at offset 30
PROMISE: 0x45330: I got a chunk of size 10
>>>itely is,
<<<
PROMISE: 0x1324e7840: Yay, p0 chunk at offset 30
SOCK: received header
SOCK: received payload
PROMISE: 0x1324e7900: Yay, p1 chunk at offset 40
PROMISE: 0x45330: I got a chunk of size 7
>>>world!
<<<
PROMISE: 0x1324e78c0: Yay, p0 chunk at offset 40
PROMISE BIG: 0x1324e7ae0: yay
PROMISE: 0x45340: entire file transfer success for true num_chunks 5
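
The offset + length chunking in the output above can be sketched as a simple loop. This is a synchronous Python illustration only (the real client chains Promises); get_file's signature and the read_chunk callable are assumptions, not the DOSclient API.

```python
def get_file(read_chunk, chunk_size=10):
    """Fetch a whole file by requesting (offset, length) chunks in order.

    read_chunk(offset, length) is a hypothetical stand-in for one chunk
    request; it returns at most `length` bytes, fewer at end of file.
    """
    chunks = []
    offset = 0
    while True:
        chunk = read_chunk(offset, chunk_size)
        if chunk:
            chunks.append(chunk)
            offset += len(chunk)
        if len(chunk) < chunk_size:
            # A short (or empty) chunk marks the end of the file.
            break
    return b"".join(chunks), len(chunks)


# 47 bytes: the sum of the chunk sizes in the output above (10+10+10+10+7).
payload = bytes(range(47))
data, num_chunks = get_file(lambda off, n: payload[off:off + n])
```

The sketch says nothing about the failure case, which is the part the commit message flags as still needing checking.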

WIP: Hmmmmm, it looks like join + next + reject = next's failure lambda not running??

WIP: hooray, failure promise is doing the right thing w/Ponyc 0.23.0

WIP: promises experiment with generics, how bad can it be?

WIP: broken

WIP: flailing but getting closer to goal

WIP: revert 2, then go forward ... but socket connect is broken, oops

WIP: revert _reconn attempt.  TODO: Re-do but without bugz ^_^

WIP: working well in good case, but lots of BUMMER & ERROR TODO races??

Client junk is racy with asleep schedulers & weird delays, but I think client behavior & cancellations are all good?

De-spam.  Also, output examples:

% ./dos-dumb-object-service
BEFORE SLEEP 1
PROMISE: I got array of size 3
	bar,47,false
	bar2,47,false
	etc-hosts,214,false
PROMISE BIG: BOOOOOO
PROMISE: file transfer for bar was success false num_chunks 0
AFTER SLEEP 1
PROMISE: I got array of size 3
	bar,47,false
	bar2,47,false
	etc-hosts,214,false

% ./dos-dumb-object-service
BEFORE SLEEP 1
AFTER SLEEP 1
PROMISE: 2 BUMMER!
PROMISE: 1 BUMMER!
PROMISE BIG: BOOOOOO
PROMISE: file transfer for bar was success false num_chunks 0

WIP: change put -> append, add thread safety to 'appending' dict

Add periodic written+fsync updates to the server

Fixes to periodic written+fsync updates to the server

Add 'debug' global var for debug spam

Add do_delete

Add fsync protocol support

WIP: syncer setup

WIP: syncer setup, chasing compiler bug

Fix (I hope) dumb server side socket timeout config

WIP: a big mess

WIP: less mess, first steps of state machine now here

WIP: remote append started, see output

AsyncJournalMirror: create
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: /Users/scott/s/src/wallaroo/asdf.journal size 0
AsyncJournalMirror: remote_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
PROMISE: remote_size_discovery BUMMER!
AsyncJournalMirror: remote_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
PROMISE: remote_size_discovery BUMMER!
DOS: connected
STAGE 10: done
TIMER: counter 1
AsyncJournalMirror: remote_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
PROMISE: I got array of size 5
	 asdf.journal,0,false
	 bar,47,false
	 bar2,47,false
	 etc-hosts,214,false
	 stream1,4933033,false
AsyncJournalMirror: start_remote_file_append for /Users/scott/s/src/wallaroo/asdf.journal
AsyncJournalMirror: start_remote_file_append _local_size 0 _remote_size 0
AsyncJournalMirror: start_remote_file_append RES ok

TIMER: counter 0
TIMER: counter 1
TIMER: counter 2
TIMER: counter 3
TIMER: counter 4
TIMER: counter 5
TIMER: counter 6
TIMER: counter 7
TIMER: counter 8
TIMER: counter 9
TIMER: counter 10
TIMER: counter expired, stopping

foo

Attempt to introduce SimpleJournalBackend

WIP: prep

Substitute File -> SimpleJournalBackend for local file only

WIP: small refactor before sending journal data to remote(s)

WIP: about ready to try sending journal data to remote

End of day: zero TCP error case seems to work @ 1 remote!

WIP: adding disconnected & reconnect tango

WIP: adding disconnected & reconnect tango, part II

WIP: adding disconnected & reconnect tango, part III

WIP: all seems to work except actual catch-up byte copies, todo next

WIP: before starting catch-up, example output below

SimpleJournalBackendRemote: be_writev offset 291 data_size 97
RemoteJournalClient: be_writev offset 291 data_size 97
DOS: disconnected
UUUGLY: DISconnected conn = 0x128d7e400
RemoteJournalClient: lambda notifier = false
UUUGLY: _sock = 0x130d64800
SOCK: sentv @ crashme 0
RemoteJournalClient: dos_client_connection_status false
RemoteJournalClient: _state 50
RemoteJournalClient: local_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
RemoteJournalClient: /Users/scott/s/src/wallaroo/asdf.journal size 388
RemoteJournalClient: remote_size_discovery for /Users/scott/s/src/wallaroo/asdf.journal
DOS: connected
UUUGLY: connected conn = 0x130d64800
RemoteJournalClient: lambda notifier = true
RemoteJournalClient: dos_client_connection_status true
RemoteJournalClient: _state 20
SOCK: sent @ crashme 5
	Found it
	asdf.journal,291,false
RemoteJournalClient: start_remote_file_append for /Users/scott/s/src/wallaroo/asdf.journal
RemoteJournalClient: start_remote_file_append _local_size 388 _remote_size 291
SOCK: sent @ crashme 4
RemoteJournalClient: start_remote_file_append RES ok
RemoteJournalClient: catch_up_state _local_size 388 _remote_size 291
TIMER: counter 4
SimpleJournalBackendRemote: be_writev offset 388 data_size 97
RemoteJournalClient: be_writev offset 388 data_size 97
RemoveJournalClient: TODO not in_sync, need buffering scheme!
TIMER: counter 5
SimpleJournalBackendRemote: be_writev offset 485 data_size 97
RemoteJournalClient: be_writev offset 485 data_size 97
RemoveJournalClient: TODO not in_sync, need buffering scheme!
TIMER: counter 6
SimpleJournalBackendRemote: be_writev offset 582 data_size 97
RemoteJournalClient: be_writev offset 582 data_size 97
RemoveJournalClient: TODO not in_sync, need buffering scheme!
TIMER: counter expired, stopping

Yay, basic catch-up is working, I think, but probably still racy & buggy

Yay, basic catch-up is working, I think, still racy & buggy, part II

WIP: @printf spam everywhere to avoid _env.print sync weirdness

Yay, basic catch-up is working, I think, still racy & buggy, part III

WIP: WTF, 'l' commands (plus framing header) are mixed into file append stream, sometimes

WIP: tcpdump confirms, 'l' commands (plus framing header) are mixed into file append stream, sometimes

WIP: fix one bug with _state checking, but still racy: Fail() at line 257, sometimes

WIP: fix another bug with _state checking, still racy??

dispose() working now, toy program halts & exits the Pony runtime

WIP: now to try to add writev buffering scheme

WIP: buffering scheme works but is racy: ls command appears in append data stream!

WIP: racy dealing with reconnect but still buggy {sigh}

WIP: racy dealing with reconnect but still buggy, take 2 {sigh}

WIP: racy dealing with reconnect but still buggy, take 3 {sigh}

WIP: racy dealing with reconnect but still buggy, take 4 {sigh}

WIP: change start_remote_file_append condition from Fail() to ignore

WIP: spam

Too much concurrent async activity happening here with Promises, needs new approach.

WIP: refactor behavior funs into private _funs

WIP: add symbolic names to state machine states

WIP: finish symbolic names + use primitive

WIP: finish switch to advise_state_change()
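
For readers following along, the idea of symbolic state names plus a single state-change entry point can be sketched roughly as below. This is Python, not the Pony code: the state names are loosely taken from the phases in the log output, and both the transition table and the behaviour of advise_state_change() here are guesses for illustration.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical names, loosely following the phases seen in the logs.
    LOCAL_SIZE_DISCOVERY = auto()
    REMOTE_SIZE_DISCOVERY = auto()
    START_REMOTE_FILE_APPEND = auto()
    CATCH_UP = auto()
    IN_SYNC = auto()


# Forward transitions assumed for this sketch. Any state may additionally
# restart from LOCAL_SIZE_DISCOVERY, e.g. after a disconnect.
NEXT = {
    State.LOCAL_SIZE_DISCOVERY: {State.REMOTE_SIZE_DISCOVERY},
    State.REMOTE_SIZE_DISCOVERY: {State.START_REMOTE_FILE_APPEND},
    State.START_REMOTE_FILE_APPEND: {State.CATCH_UP},
    State.CATCH_UP: {State.IN_SYNC},
    State.IN_SYNC: set(),
}


class Client:
    def __init__(self):
        self.state = State.LOCAL_SIZE_DISCOVERY

    def advise_state_change(self, new_state):
        """Apply the change if it is legal; ignore stale requests otherwise."""
        ok = (new_state is State.LOCAL_SIZE_DISCOVERY
              or new_state in NEXT[self.state])
        if ok:
            self.state = new_state
        return ok
```

Funnelling every change through one function makes it easier to ignore a late reply from an earlier phase instead of calling Fail().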

WIP: bugfix?

WIP: possible bugfix for catchup?, dbg spam adjustments

WIP: possible bugfix for catchup?, dbg spam adjustments

WIP: possible bugfix for catchup?, dbg spam adjustments

WIP: possible bugfix for _connected?, dbg spam adjustments

WIP: possible bugfix for _in_sync?, dbg spam adjustments

WIP: possible bugfix for _send_buffer_state?, dbg spam adjustments

WIP: possible bugfix adding _appending state (after avoiding adding it, it seems necessary)

WIP: possible bugfix adding _appending state to DOSclient {sigh}

WIP: splitting _SStartRemoteFileAppend state, long overdue

WIP: splitting _SRemoteSizeDiscovery state, long overdue

WIP: race chasing & fixing

WIP: replace rare  Fail() condition inside _send_buffer_state  with _make_new_dos_then_local_size_discovery()

WIP: state change fix: catchup -> _make_new_dos_then_local_size_discovery

First try at buffer size limit
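
A rough Python sketch of a size-limited writev buffer, for illustration only: writes that arrive while the remote is not in sync are held as (offset, data) pairs, and the buffer is discarded once a limit is exceeded so the caller can fall back to catch-up. WritevBuffer, max_bytes and the discard-on-overflow policy are assumptions, not the Pony implementation. The offsets and the 97-byte write size are taken from the log output earlier in this message.

```python
class WritevBuffer:
    """Holds (offset, data) writes that arrive while the remote catches up."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.items = []
        self.size = 0

    def add(self, offset, data):
        """Buffer one write; return False (and discard all) on overflow."""
        if self.size + len(data) > self.max_bytes:
            # Over the limit: drop everything, the caller must catch up
            # from the local file instead of replaying the buffer.
            self.items.clear()
            self.size = 0
            return False
        self.items.append((offset, data))
        self.size += len(data)
        return True

    def drain(self):
        """Return the buffered writes in arrival order and reset the buffer."""
        items, self.items, self.size = self.items, [], 0
        return items


buf = WritevBuffer(max_bytes=200)
ok1 = buf.add(388, b"x" * 97)
ok2 = buf.add(485, b"x" * 97)
ok3 = buf.add(582, b"x" * 97)   # 291 bytes in total would exceed the limit
```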

WIP: fix race (remove Fail()) in catch-up ending vs writev

WIP: big refactoring to be very event-oriented. Compiles. Not tested

WIP: bugfixes for corrupted files due to _connected state mismanagement

WIP: remove old commented hunks

WIP: bugfix: add missing remote_size_discovery_reply _state sanity check

WIP

Hooray, this FSM simplification has passed over 2K test iterations (see instructions below)

To run the server

    mkdir ./yodel
    python2 ./utils/dos-dumb-object-service/dos-server.py ./yodel

To compile the test client:

    ./utils/dos-dumb-object-service

To run the test client in a test loop:

    date ; ( for i in `seq 1 2000`; do ./../../../../../../../bin/rm -fv asdf* ; rm -fv yodel/asdf* ; ./dos-dumb-object-service --ponyminthreads=8 --ponythreads=8 asdf 2>&1 | tee foo.out ; grep "====" foo.out ; /bin/echo -n "cmp says: " ; cat /dev/null | cmp asdf* yodel/asdf* ; if [ $? -ne 0 ]; then echo STOP; exit 4; fi; done ) > foo.all.out 2>&1 ; cat foo.all.out |egrep 'STOP|hey' ; date

Attempt to beautify the @printf spam

Split dos_client.pony into multiple files

WIP: minor refactoring, add timeouts to each Promise in RemoteJournalClient

Fix debug @printf spam formatting errors.

Bugfixes for timeout handling

Add TCP keepalive. If throttled more than 2 seconds, then disconnect.
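
As a rough illustration of the two pieces in this entry, here is a Python sketch: enabling keepalive on a socket, and a small timer that reports when a connection has been throttled for longer than a limit. ThrottleWatch and its method names are hypothetical, and the 2.0 second default simply mirrors the figure above. The Pony code does this through its own TCP connection actor and notifier.

```python
import socket
import time

# Enable TCP keepalive on a (not yet connected) socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()


class ThrottleWatch:
    """Hypothetical helper: tracks how long a connection has been throttled."""

    def __init__(self, limit_seconds=2.0, clock=time.monotonic):
        self._limit = limit_seconds
        self._clock = clock
        self._since = None          # time throttling began, or None

    def throttled(self):
        if self._since is None:
            self._since = self._clock()

    def unthrottled(self):
        self._since = None

    def should_disconnect(self):
        # True once throttling has lasted longer than the limit.
        return (self._since is not None
                and self._clock() - self._since > self._limit)
```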

WIP: add usedir protocol command, not yet working

Add usedir protocol command, sort of works now

WIP: timeout tweaks for testing only

WIP

WIP: refactor debugging messages

WIP: start handling written/synced stats from server

Finish handling written/synced stats from server

WIP: more spam

WIP: spam shortening

Remove _rjc: None type as an option for DOSclient

foo

Add usage comments to the DOS client

Add a remove() behavior to SimpleJournal, use it in Startup._remove_file()

Move SimpleJournal code to lib/wallaroo/ent/recovery/simple_journal.pony

WIP: switch to SimpleJournalMirror scheme with remote client stubbed

WIP: alphabetize

WIP: common code refactor

WIP: Move several utils/dos*/*.pony to lib/wallaroo/ent/recovery

WIP: DOSclients added but are broken

WIP: DOSclients added but are broken

WIP: DOSclients added but are broken

Whoops! Fix file descriptor leak + check for file open errors!

WIP: debug spam

WIP: debug spam

Bugfix: avoid HUGE HUGE concurrency headache by avoiding automatic reconnect by DOSclient

GC bug workaround attempt: during catchup, yield scheduler periodically

This patch attempts to work around this GC problem ... but AFAICT it
doesn't help.

(lldb) bt all
* thread #1: tid = 0, 0x000000000072a680 market-spread`ponyint_heap_mark, name = 'market-spread', stop reason = signal SIGSEGV
  * frame #0: 0x000000000072a680 market-spread`ponyint_heap_mark
    frame #1: 0x000000000072d007 market-spread`ponyint_gc_markimmutable + 103
    frame #2: 0x000000000072c078 market-spread`ponyint_mark_done + 24
    frame #3: 0x000000000073055c market-spread`ponyint_actor_run + 572
    frame #4: 0x0000000000724862 market-spread`run_thread + 242
    frame #5: 0x00007fda382b16ba libpthread.so.0`start_thread + 202
    frame #6: 0x00007fda378c441d libc.so.6`clone + 109

WIP: fix stack depth problem during catch-up phase, even with pathologically small write sizes

WIP: fix stack depth problem during catch-up phase, even with pathologically small write sizes

Bugfix: if remote TCP connection fails, retry, eh?