Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GenServer.call Indexer.InternalTransactionFetcher terminating every minute #676

Closed
acravenho opened this issue Sep 5, 2018 · 8 comments

Comments

@acravenho
Copy link
Member

commented Sep 5, 2018

This error is happening every few seconds and causing the catch-up indexer to terminate on the Ethereum Classic chain. It's indexing very slow and is probably caused by this error.

From error.log

04:40:55.756 [error] Task #PID<0.18499.21> started from #PID<0.18448.21> terminating
** (stop) exited in: GenServer.call(Indexer.InternalTransactionFetcher, {:buffer, [%{block_number: 5569262, hash_data: "0x144e4d78a6395f6b15793c86ac23b7a48a0a812d7e01a52a948898bcd67ddf03"}, %{block_number: 5569262, hash_data: "0x1ab5e038f8fb98912f3f2001704ec978ce9e0d9a6004ee05e7a828a8d0380fee"}, %{block_number: 5569259, hash_data: "0x1e071f6f6f64f59da623dea6983d67038b4903055ffb316f30bf13e40da73537"}, %{block_number: 5569262, hash_data: "0x329bc2c9c86ed80cb8bc1980fa271824ea1f2c75d75aaa91a21daaa51691d94a"}, %{block_number: 5569262, hash_data: "0x469252f5f481c476d0cc79fbc5ab8a4f248d83a54882f229e13627fcf792bb01"}, %{block_number: 5569262, hash_data: "0x51180231e67a750e7232e23ea30946892afa44c800c4295736ccf650a074ac56"}, %{block_number: 5569262, hash_data: "0x55aab70eda0718ce64214529b539144c0be44a3893ae1674354e42401891b8ea"}, %{block_number: 5569262, hash_data: "0x5e1f8926f7cd545a0207b6d82475a9838e17128d7d77fd82d59a36542ec567ab"}, %{block_number: 5569257, hash_data: "0x90e1166c288ab3dc2532b4e8ce42ac6eec51a746d074adcd29edcb58d5b4bf61"}, %{block_number: 5569263, hash_data: "0x972c52dbaa2d281f71dc07c0768167014a0665a080be303d8bdc3bf026f42a69"}, %{block_number: 5569263, hash_data: "0x988608c53de71f8daa154a25ce87e0210faab91663de1f565b22751ea672bfcc"}, %{block_number: 5569262, hash_data: "0xa702f2ee1887142acd2b65b1e1024752ebc9202564b2282886f03963ce7eade6"}, %{block_number: 5569262, hash_data: "0xb2b62238647e3507a40151a3beec1b13e3797a15591850c6c7f6ccb35406279f"}, %{block_number: 5569262, hash_data: "0xb44ad27836a7eb0a79c1c200b9ec8575b2619f92e0943010f9cbb31d83124ea5"}, %{block_number: 5569257, hash_data: "0xbcc10aa54539255954a596172d1921ad7622a8d61cae012206603fed434d7b71"}, %{block_number: 5569262, hash_data: "0xc5f1c895941136ee959f91226fb567ec4754285cabd93f2b4af4f718a2d98429"}, %{block_number: 5569262, hash_data: "0xc9425765b5e84d3e2bc112063ea362b1a771d1b396fddc71f3b4b82d9a09c231"}, %{block_number: 5569263, hash_data: "0xcbf7b8dfbb79602682baab9a3d66b6c9efc86b1ffd8e0e02dc054097e3d8f279"}, %{block_number: 5569262, hash_data: "0xd0d6676c414d956a56fb54d38970c3d018500ae74db7e1df113f61b9f36cefb8"}, %{block_number: 5569259, hash_data: "0xdee9b650b980cb02c2167e4b047b7417932f8807b34d2e8f905f6a817e5d821a"}, %{block_number: 5569263, hash_data: "0xea03435281d7adf988267f6c35737f472f63d52d3fb673cb47d3cce5981b346c"}, %{block_number: 5569263, hash_data: "0xeab590a8efba71846f51d04f98f79483703ac7f00854fbd647b49a7b390712ae"}, %{block_number: 5569262, hash_data: "0xf2d6f52a37348c7c6b4940c48d5a81012ef4aa7c4744f83c514451ee574f345b"}, %{block_number: 5569263, hash_data: "0xfe1e4810afe3118b8421a0976369ffcbeb670e3cba21082a69091e757d0ba645"}]}, 10000)
    ** (EXIT) time out
    (elixir) lib/gen_server.ex:924: GenServer.call/3
    (indexer) lib/indexer/block_fetcher/catchup.ex:140: Indexer.BlockFetcher.Catchup.async_import_remaining_block_data/2
    (indexer) lib/indexer/block_fetcher/catchup.ex:107: Indexer.BlockFetcher.Catchup.import/2
    (indexer) lib/indexer/block_fetcher.ex:114: Indexer.BlockFetcher.fetch_and_import_range/2
    (indexer) lib/indexer/block_fetcher/catchup.ex:167: Indexer.BlockFetcher.Catchup.fetch_and_import_range_from_sequence/3
    (elixir) lib/task/supervised.ex:89: Task.Supervised.do_apply/2
    (elixir) lib/task/supervised.ex:38: Task.Supervised.reply/5
    (stdlib) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Function: &:erlang.apply/2
    Args: [#Function<8.121428738/1 in Indexer.BlockFetcher.Catchup.stream_fetch_and_import/2>, [5569266..5569257]]

From indexer.log

04:40:55.777 application=indexer [error] Catchup index stream exited with reason ({:timeout, {GenServer, :call, [Indexer.InternalTransactionFetcher, {:buffer, [%{block_number: 5569293, hash_data: "0x135328a47f1ddb3aa6f66ad4c73bb0fbac679f4eefc5efef4405ad6906c1bc7f"}, %{block_number: 5569291, hash_data: "0x1decb19820853643c643b9ee4a48fc6c2ab7aa7b6a5f997e0a32918b53140120"}, %{block_number: 5569291, hash_data: "0x41f557705779435d4fcc31053d183c75d3df12b7c835db4f733b5a9eecd2a916"}, %{block_number: 5569295, hash_data: "0x4ca32b1c9f36bd89e3df2ec44af02f702845f80a0b0bbdceddb21f66cd5a578f"}, %{block_number: 5569291, hash_data: "0x616b303827dd96314d9d86410b1477dc76fe7e51d282773f692586da12cef80f"}, %{block_number: 5569293, hash_data: "0x7507e16ba2bccc9874d4c1711b81ee47eba724856191cbdee29cd507a587d7ca"}, %{block_number: 5569292, hash_data: "0x7e5513dd169502a9e3693ea33d6590ab5a88d6f35e494ce675e27b8d4c5c8c9e"}, %{block_number: 5569289, hash_data: "0x9896809f00547151d19940ae80069f056defbbeacb92840cd435182bbb8d7b0b"}, %{block_number: 5569293, hash_data: "0xaeaee4bdd267e6213564ade14353da68b168ccfdb5aa9e3989060441ed343dc9"}, %{block_number: 5569287, hash_data: "0xaf4cdef30fc05f69df2319b33c61ae94b7556dc3d7ecca7d2f0c131ea7bf88b5"}, %{block_number: 5569289, hash_data: "0xaf99e74bbbd8842811d5b7fb4104e1bbb63c093f1983127f1e029df3e27a21ad"}, %{block_number: 5569293, hash_data: "0xb8644f84f989c5ea84b1046edbebcca3a8e37f93fcdb155f6713f1b11e50dd32"}, %{block_number: 5569288, hash_data: "0xc12f868dbd37bde8326fb0ed204e7395145bc8ae8c487a946bc4398dcdfdb5b8"}, %{block_number: 5569293, hash_data: "0xd27b2183f41ec0afd371e7499ddd3fa640b2204d49d1458e5c7fa9791d1aa75d"}, %{block_number: 5569290, hash_data: "0xd74159d5bbb0a65b95435c8d76fc559488978e7cb3a600daa75c8be343177ece"}, %{block_number: 5569296, hash_data: "0xd920f94ed71cfd5b5e401a6b2e65bd97ff75fa3cb683502fe4380cc47d04d5fb"}, %{block_number: 5569291, hash_data: "0xd96622c23d87e3123f68386785ac4f3d6c611d082ba3722db357ee21fef79b72"}, %{block_number: 5569293, hash_data: "0xe46b28a7b4ee6018189b6351ada730801ab216fdb58361d7281f23bc4cad4e43"}, %{block_number: 5569287, hash_data: "0xe57b5afe230e5216bb04b83c327abf0997694531db70dfdb8cf4d296f6b81c85"}, %{block_number: 5569295, hash_data: "0xe8b80e2518dfd559c23bddc73b58766e925de19a11d9cec5e30cd8ccb2d75ca0"}, %{block_number: 5569291, hash_data: "0xf1a01b935b8cd382800b46a372ef96561109916ab12dc28ab5aa8424ca297200"}, %{block_number: 5569292, hash_data: "0xf50439ca70ccd90fc4302018b816db459aceeacc3af1244115baadcccfdfa3e4"}, %{block_number: 5569293, hash_data: "0xf8f0301508fe6e53feb44400f2a23cf1c68737ce337eb6ce9ae92491235ee09d"}, %{block_number: 5569291, hash_data: "0xf8fb922d9c69dbfd4dc5f75616ad7f106977f90cf2589f6a248bbe868bb62b9f"}]}, 10000]}}). Restarting
04:41:30.304 application=indexer [error] Catchup index stream exited with reason ({:timeout, {GenServer, :call, [Indexer.InternalTransactionFetcher, {:buffer, [%{block_number: 5569098, hash_data: "0x06cef07a461a11ee7f7d60a36e960cda9b1178e0c5afba1142cc3938fb889c1f"}, %{block_number: 5569106, hash_data: "0x12517a7dc342546aedd3a8bc5b6b53cfa29f2b55cdeb9c61833803287a0b7aea"}, %{block_number: 5569104, hash_data: "0x1252cc642a67d7d76c7c6e128d23a4d9f860d9648b5a9dfc05dc6892f63b3c32"}, %{block_number: 5569103, hash_data: "0x1615845a72e5d7e964742613cae439a1c7ce32db8e724cdd55d441ec2abdfb40"}, %{block_number: 5569098, hash_data: "0x1aa925ac5cfc65d250af2975f373f42aab4b15ff3624dffd7ad3b4c7bd54d02f"}, %{block_number: 5569103, hash_data: "0x1efbaf245effcdfe67989bf5636e3a68e187e13bc67564d122d52a60838bc54d"}, %{block_number: 5569102, hash_data: "0x2a56ccb45ca115367ee32e96912e17124bf619ce5bd9294ba7452911bf1391a4"}, %{block_number: 5569103, hash_data: "0x4323336499b2d6e82861c94bd108110bd005e0c3a15276b977453218ca896e07"}, %{block_number: 5569097, hash_data: "0x46e764f3d359cf9f7f8d9fd06b5d7ca9c85298b1e918d76f282bf7e411b9ed31"}, %{block_number: 5569104, hash_data: "0x4e805378f5419671f335b974fd400df592a53ffc3622805ed20ea4ca1699f956"}, %{block_number: 5569097, hash_data: "0xc4105d43424044157ad72ec2f79670ce1c0941c24b35bc972ec4ddbeda0576df"}, %{block_number: 5569106, hash_data: "0xcd9e6cf23e336c322ff7a346d6cc9740c7f6db79a60c09dd3722016c7aad8af9"}, %{block_number: 5569106, hash_data: "0xd0dafcc03548446e9718a05033eb462f3fbc931c63a55e338b4d06c601aced23"}, %{block_number: 5569105, hash_data: "0xd9bd1242c21e220d953fa00a60424979cfa8231842fe9476ff07a0c059ccaebd"}]}, 10000]}}). Restarting

Environment

  • Operating System: Linux

Steps to reproduce

  1. Index ETC chain.
  2. View error logs.

Expected behaviour

The catchup indexer should run continuously.

Actual behaviour

It's failing every minute or so.

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

This would indicate that

|> InternalTransactionFetcher.async_fetch(10_000)

is taking longer than 10 seconds.

Internally, the timeout is used for this GenServer.call

@spec async_fetch([%{required(:block_number) => Block.block_number(), required(:hash) => Hash.Full.t()}]) :: :ok
def async_fetch(transactions_fields, timeout \\ 5000) when is_list(transactions_fields) do
params_list = Enum.map(transactions_fields, &transaction_fields_to_params/1)
BufferedTask.buffer(__MODULE__, params_list, timeout)
end

def buffer(server, entries, timeout \\ 5000) when is_list(entries) do
GenServer.call(server, {:buffer, entries}, timeout)
end

When the call is received in the server

def handle_call({:buffer, entries}, _from, state) do
{:reply, :ok, buffer_entries(state, entries)}
end

defp buffer_entries(state, []), do: state
defp buffer_entries(state, entries) do
%{state | current_buffer: [entries | state.current_buffer]}
end

This is a simple append to a list, which is fast, so the only thing that could take a lot of time is if there is too much data being transferred between processes.

However, it could also be that the handle_call is not even called yet and instead, the InternalTransactionFetcher is too busy doing something else. More investigation is needed.

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

Turning on tracing for the InternalTransactionFetcher shows that the InternalTransactionFetcher eventually responds to the request, so it is simply the timeout being 2 short.

There are a three paths forward:

  1. Simply increase the timeout. Since the timeout is 10 seconds now, anything that hit the timeout before will of course take longer than 10 seconds
  2. Change from GenServer.call to GenServer.cast. This removes blocking of Indexer.BlockFetcher.Catchup, but that means there is no backpressure and InternalTransactionFetcher's memory could get flooded.
  3. Investigate why InternalTransactionFetcher is taking 10 seconds to service the request.

I'm going to continue with (3) for now as it is better longterm as it means understanding why 10 seconds isn't enough to queue the data.

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

Using :eprof, the majority of the time in InternalTransactionFetcher is spent in send:

****** Process <0.461.0>    -- 100.00 % of profiled time *** 
FUNCTION                                                      CALLS        %      TIME  [uS / CALLS]
--------                                                      -----  -------      ----  [----------]
lists:split/2                                                     1     0.00         2  [      2.00]
'Elixir.Enum':'-chunk_every/4-fun-0-'/4                          69     0.00        30  [      0.43]
'Elixir.Enum':take_list/2                                        60     0.00        35  [      0.58]
lists:flatten/1                                                  69     0.00        36  [      0.52]
'Elixir.Indexer.BufferedTask':schedule_next_buffer_flush/1       69     0.00        63  [      0.91]
'Elixir.Stream.Reducers':chunk_every/5                           69     0.00        73  [      1.06]
'Elixir.Enum':chunk_every/4                                      69     0.00        73  [      1.06]
'Elixir.Enum':chunk_while/4                                      69     0.00        74  [      1.07]
'Elixir.Enumerable':reduce/3                                     69     0.00        89  [      1.29]
'Elixir.Enum':reduce/3                                           69     0.00       101  [      1.46]
erlang:max/2                                                     69     0.00       101  [      1.46]
'Elixir.Stream.Reducers':'-chunk_every/5-fun-1-'/4               69     0.00       126  [      1.83]
'Elixir.List':flatten/1                                          69     0.00       145  [      2.10]
'Elixir.Indexer.BufferedTask':flush/1                           138     0.00       166  [      1.20]
'Elixir.Enum':chunk_every/2                                      69     0.00       194  [      2.81]
erlang:send_after/3                                              69     0.00       201  [      2.91]
lists:split/3                                                  2437     0.00       653  [      0.27]
'Elixir.Process':whereis/1                                     2250     0.00      1199  [      0.53]
'Elixir.Enumerable.Map':count/1                                2319     0.00      1245  [      0.54]
'Elixir.Task.Supervisor':async_nolink/3                        2250     0.00      1287  [      0.57]
'Elixir.GenServer':call/3                                      2250     0.00      1339  [      0.60]
'Elixir.Task.Supervisor':async_nolink/5                        2250     0.00      1361  [      0.60]
lists:keyfind/3                                                2250     0.00      1367  [      0.61]
gen:do_call/4                                                  2250     0.00      1486  [      0.66]
'Elixir.Task.Supervisor':async/6                               2250     0.00      1719  [      0.76]
'Elixir.Enumerable':count/1                                    2319     0.00      1920  [      0.83]
'Elixir.Enumerable':impl_for/1                                 2388     0.00      1971  [      0.83]
gen:call/4                                                     2250     0.00      2512  [      1.12]
erlang:whereis/1                                               2250     0.00      2645  [      1.18]
'Elixir.Indexer.BufferedTask':spawn_next_batch/1               2319     0.00      2657  [      1.15]
'Elixir.Indexer.BufferedTask':handle_call/3                    3145     0.00      2791  [      0.89]
gen_server:handle_common_reply/8                               4567     0.00      2945  [      0.64]
'Elixir.Access':get/3                                          2250     0.00      2967  [      1.32]
'Elixir.Process':info/2                                        2250     0.00      3024  [      1.34]
'Elixir.Indexer.BufferedTask':drop_task/2                      2250     0.00      3111  [      1.38]
'Elixir.GenServer':whereis/1                                   2250     0.01      3220  [      1.43]
gen_server:handle_msg/6                                        7713     0.01      3328  [      0.43]
queue:out/1                                                    2250     0.01      3486  [      1.55]
'Elixir.Task.Supervisor':get_info/1                            2250     0.01      3656  [      1.62]
maps:remove/2                                                  2250     0.01      3666  [      1.63]
gen_server:try_handle_call/4                                   3145     0.01      3829  [      1.22]
'Elixir.Enumerable':'impl_for!'/1                              2388     0.01      3838  [      1.61]
gen_server:try_dispatch/4                                      4568     0.01      3886  [      0.85]
'Elixir.Task.Supervisor':async_nolink/2                        2250     0.01      3933  [      1.75]
gen_server:decode_msg/9                                        7713     0.01      4520  [      0.59]
'Elixir.Task.Supervisor':start_child_with_spec/4               2250     0.01      4652  [      2.07]
'Elixir.Enum':count/1                                          2319     0.01      4912  [      2.12]
maps:put/3                                                     2249     0.01      5158  [      2.29]
'Elixir.Enum':'-reduce/3-lists^foldl/2-0-'/3                  21405     0.01      5160  [      0.24]
'Elixir.Indexer.BufferedTask':queue_in_state/3                21336     0.01      5386  [      0.25]
queue:in/2                                                    21336     0.01      5594  [      0.26]
'Elixir.Indexer.BufferedTask':take_batch/1                     2250     0.01      7606  [      3.38]
'Elixir.Indexer.BufferedTask':'-flush/1-fun-0-'/2             21336     0.01      9227  [      0.43]
erlang:monitor/2                                               4499     0.02      9999  [      2.22]
'Elixir.Enum':take/2                                          21336     0.02     10502  [      0.49]
erlang:send/2                                                  2249     0.02     10841  [      4.82]
'Elixir.Indexer.BufferedTask':handle_info/2                    4568     0.02     11087  [      2.43]
lists:reverse/1                                               21345     0.02     11346  [      0.53]
erlang:process_info/2                                          2250     0.02     12147  [      5.40]
gen_server:reply/2                                             3145     0.02     14775  [      4.70]
gen_server:loop/7                                              7712     0.06     38644  [      5.01]
'Elixir.Indexer.BufferedTask':queue_in_queue/3                21336     0.06     38912  [      1.82]
'Elixir.Enumerable.List':reduce/3                            213155     0.09     55351  [      0.26]
erlang:demonitor/2                                             2249     0.15     96083  [     42.72]
'Elixir.Indexer.BufferedTask':buffer_entries/2                 3145     0.16     98186  [     31.22]
'Elixir.Enum':'-chunk_while/4-fun-0-'/3                      213086     0.16     99449  [      0.47]
lists:do_flatten/2                                           219409     0.19    119474  [      0.54]
'Elixir.Stream.Reducers':'-chunk_every/5-fun-0-'/5           213086     0.23    146141  [      0.69]
lists:reverse/2                                               21407     0.27    169811  [      7.93]
gen_server:try_dispatch/3                                      4568     0.61    377158  [     82.57]
queue:len/1                                                    2250     3.40   2117414  [    941.07]
erlang:send/3                                                  2250    94.28  58679566  [  26079.81]
----------------------------------------------------------  -------  -------  --------  [----------]
Total:                                                      1161893  100.00%  62241651  [     53.57]

Using :observer, InternalTransactionFetcher actually has a message queue, which is exceptional, most, processes clear their queues so fast that they show up as 0 in :observer.

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

When using :observer to get the process info for InternalTransactionFetcher, it looks like it is often busy spawning the next batch:

screen shot 2018-09-11 at 9 50 52 am

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2018

Something hinky is going on with the size of the closures from

Task.Supervisor.async_nolink(state.task_supervisor, fn ->
{:performed, state.callback_module.run(batch, retries, state.callback_module_state)}
end)

If I make this change

      f = fn ->
          {:performed, state.callback_module.run(batch, retries, state.callback_module_state)}
        end

      if state.callback_module == Indexer.InternalTransaction.Fetcher do
        IO.puts ["Indexer.InternalTransaction.Fetcher batch closure size / flat size: ", to_string(:erts_debug.size(f)), " / ", to_string(:erts_debug.flat_size(f))]
      end

      task =
        Task.Supervisor.async_nolink(state.task_supervisor, f)

The measured size and flat size of the closure f slowly increases

 Indexer.InternalTransaction.Fetcher batch closure size / flat size: 23888 / 29328
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 24587 / 30189
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 27474 / 33742
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 32497 / 39923
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 36274 / 44570
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 39906 / 49039
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 44008 / 54086
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 48522 / 59641
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 52979 / 65124
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 56078 / 68670
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 56545 / 69245
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 60723 / 74386
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 61461 / 75295
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 65988 / 80866
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 67404 / 82609
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 69643 / 85364
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 72855 / 89317
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 75272 / 92292
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 79970 / 98073
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 83340 / 102220
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 85382 / 104733
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 90409 / 110918
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 90549 / 111091
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 93655 / 114596
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 94852 / 116069
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 100016 / 122424
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 101146 / 123815
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 101730 / 124534
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 107195 / 131259
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 108091 / 132362
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 110391 / 135193
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 112769 / 138120
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 121183 / 148475
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 128005 / 156578
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 130032 / 159073
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 131279 / 160608
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 133895 / 163827
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 133760 / 163662
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 137429 / 168177
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 139222 / 170384
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 140746 / 172259
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 140611 / 172094
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 143872 / 175994
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 143960 / 176103
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 146834 / 179640
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 152124 / 186151
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 156651 / 191722
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 157454 / 192711
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 159535 / 195272
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 162409 / 198644
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 164713 / 201479
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 165477 / 202420
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 170550 / 208663
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 170677 / 208820
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 172782 / 211411
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 178153 / 217877
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 182383 / 223082
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 182248 / 222917
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 184420 / 225590
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 184586 / 225795
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 190469 / 233034
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 190837 / 233372
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 192658 / 235613
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 194033 / 237306
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 200852 / 245697
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 200717 / 245532
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 203929 / 249485
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 204638 / 250242
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 209169 / 255817
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 209034 / 255652
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 211622 / 258837
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 212269 / 259634
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 220080 / 269113
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 219945 / 268948
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 224522 / 274581
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 225494 / 275778
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 230974 / 282521
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 231259 / 282776
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 236005 / 288617
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 237731 / 290742
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 242405 / 296493
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 242270 / 296328
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 246602 / 301545
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 246846 / 301846
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 251884 / 308045
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 251749 / 307880
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 257548 / 315017
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 258799 / 316436
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 265488 / 324667
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 265353 / 324502
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 267460 / 327095
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 267548 / 327204
Indexer.InternalTransaction.Fetcher batch closure size / flat size: 272995 / 333809
KronicDeth added a commit that referenced this issue Sep 12, 2018
Don't use closure for BufferedTask Task.Supervisor.async_nolink call
Fixes #676

For some reason, which I haven't tracked down yet, a closure inside of
`BufferedTask.spawn_next_batch/1` continually grows in size.  When this
closure is passed to `Task.Supervisor.async_nolink`, the
`Task.Supervisor`, for #676, this was `Indexer.TaskSupervisor` and is
now `Indexer.InternalTransaction.TaskSupervisor`, it also grows in
memory size.  I think this is related to the way that functions and
closures are serialized when sent between processes, but I haven't
figured out how to verify that.

The change also cleared the MsgQ for `Indexer.TransactionFetcher`,
probably because its not blocked by the call to
`Indexer.InternalTransaction.Fetcher`, which was blocked due to
something to do with the memory usage of the closures.

The `Indexer.InternalTransaction.TaskSupervisor` consistently shows
multiple tasks under it in `:observer`, which it did not previously
either.
KronicDeth added a commit that referenced this issue Sep 12, 2018
Don't use closure for BufferedTask Task.Supervisor.async_nolink call
Fixes #676

For some reason, which I haven't tracked down yet, a closure inside of
`BufferedTask.spawn_next_batch/1` continually grows in size.  When this
closure is passed to `Task.Supervisor.async_nolink`, the
`Task.Supervisor`, for #676, this was `Indexer.TaskSupervisor` and is
now `Indexer.InternalTransaction.TaskSupervisor`, it also grows in
memory size.  I think this is related to the way that functions and
closures are serialized when sent between processes, but I haven't
figured out how to verify that.

The change also cleared the MsgQ for `Indexer.TransactionFetcher`,
probably because its not blocked by the call to
`Indexer.InternalTransaction.Fetcher`, which was blocked due to
something to do with the memory usage of the closures.

The `Indexer.InternalTransaction.TaskSupervisor` consistently shows
multiple tasks under it in `:observer`, which it did not previously
either.
@josevalim

This comment has been minimized.

Copy link

commented Sep 12, 2018

@KronicDeth Note that you are passing the whole state to the closure and you are not doing that if you are using a MFA. Can you measure the size of state+batch+retries alongside each closure?

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2018

O, 🤦‍♂️ , of course, the closure captures state because the old code wasn't pulling the state fields into separate locals and the state has the big honky :queue in it and even though we don't use the state.buffer :queue, it still is there in the closure being flattened.

@KronicDeth

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2018

This is such a cool 🐛 .

KronicDeth added a commit that referenced this issue Sep 12, 2018
Don't use closure for BufferedTask Task.Supervisor.async_nolink call
Fixes #676

For some reason, which I haven't tracked down yet, a closure inside of
`BufferedTask.spawn_next_batch/1` continually grows in size.  When this
closure is passed to `Task.Supervisor.async_nolink`, the
`Task.Supervisor`, for #676, this was `Indexer.TaskSupervisor` and is
now `Indexer.InternalTransaction.TaskSupervisor`, it also grows in
memory size.  I think this is related to the way that functions and
closures are serialized when sent between processes, but I haven't
figured out how to verify that.

The change also cleared the MsgQ for `Indexer.TransactionFetcher`,
probably because its not blocked by the call to
`Indexer.InternalTransaction.Fetcher`, which was blocked due to
something to do with the memory usage of the closures.

The `Indexer.InternalTransaction.TaskSupervisor` consistently shows
multiple tasks under it in `:observer`, which it did not previously
either.
KronicDeth added a commit that referenced this issue Sep 12, 2018
Don't use closure for BufferedTask Task.Supervisor.async_nolink call
Fixes #676

For some reason, which I haven't tracked down yet, a closure inside of
`BufferedTask.spawn_next_batch/1` continually grows in size.  When this
closure is passed to `Task.Supervisor.async_nolink`, the
`Task.Supervisor`, for #676, this was `Indexer.TaskSupervisor` and is
now `Indexer.InternalTransaction.TaskSupervisor`, it also grows in
memory size.  I think this is related to the way that functions and
closures are serialized when sent between processes, but I haven't
figured out how to verify that.

The change also cleared the MsgQ for `Indexer.TransactionFetcher`,
probably because its not blocked by the call to
`Indexer.InternalTransaction.Fetcher`, which was blocked due to
something to do with the memory usage of the closures.

The `Indexer.InternalTransaction.TaskSupervisor` consistently shows
multiple tasks under it in `:observer`, which it did not previously
either.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.