Fix Out Of Direct Memory Exception and FileInfo/TransferInfo #77

Merged
merged 4 commits into from Sep 4, 2020

@fredericBregier (Collaborator) commented Aug 25, 2020

Under certain circumstances, Direct Memory usage reaches its maximum.
This fix tries to limit that impact as much as possible.

  • File copy: Guava, which used large blocks (512 KB), is no longer used, and direct memory usage is reduced (FileChannel)
  • File reading: now uses a standard buffer instead of direct memory
  • Hash computing: uses a byte array instead of a Netty ByteBuf, except when necessary
  • KeepAlive algorithm fixed: under certain circumstances it failed, and a long task delay could erroneously lead to a TimeOut
  • Some Netty buffers are freed more quickly
  • For LocalPacket, a single buffer is built from the default allocator (pooled allocation) whenever possible. DataPacket, the most frequently used packet, is optimized according to the direction of transfer (ByteBuf for receiving, byte array for sending).
  • NetworkPacket was a wrapped buffer of multiple sub-buffers (from LocalPacket); it is now turned into a single Netty ByteBuf when reasonable.
  • Direct Memory usage was reduced as much as possible, including in the Netty configuration itself (see below).
  • One large, long-running IT test added to check correctness (no Out Of Direct Memory Error detected)
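
The "File reading" and "Hash computing" items above can be illustrated with a minimal sketch. It is not the R66 implementation: the class name `HeapBufferDigest`, the method `digest` and the 64 KB buffer size are assumptions chosen for illustration. It only shows the general idea of hashing a file through a plain heap `byte[]` and a `FileInputStream`, without allocating any direct `ByteBuffer` in application code.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class HeapBufferDigest {
  // Hypothetical buffer size, chosen only for this sketch.
  private static final int BUFFER_SIZE = 64 * 1024;

  private HeapBufferDigest() {
  }

  /**
   * Computes the digest of a file by reading it through a heap byte array.
   * No direct ByteBuffer is allocated by this code.
   */
  public static byte[] digest(Path file, String algorithm)
      throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance(algorithm);
    byte[] buffer = new byte[BUFFER_SIZE];
    try (InputStream in = new FileInputStream(file.toFile())) {
      int read;
      // read() returns -1 at end of file; otherwise the number of bytes read
      while ((read = in.read(buffer)) != -1) {
        md.update(buffer, 0, read);
      }
    }
    return md.digest();
  }
}
```

For example, `HeapBufferDigest.digest(path, "SHA-256")` returns the raw digest bytes; the single buffer is reused for every read, so the memory cost stays constant whatever the file size.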

In Netty, most ByteBuf instances are allocated using a pooled allocator. Depending on usage (I/O such as network traffic, or application-level standard allocation), they can be Direct or Heap based. Netty allocates as needed and keeps memory allocation cheap, so that the garbage collector is not called too often.
However, if the application itself makes intensive use of Direct allocation (not through Netty), this can lead to memory contention between Netty and the application.
This fix greatly reduces Direct allocation outside Netty and increases the use of the Netty pooled allocator, which decreases memory pressure.

Direct Memory usage was reduced as much as possible.
The default configuration is adapted at startup but can be overridden:

  • -Dio.netty.noPreferDirect=true is set by default; it limits the usage of Direct Buffers, except for Netty incoming network messages.
  • -Dio.netty.maxDirectMemory=0 is set by default; Netty will use the JDK maximum Direct Memory (recommended).

The recommended value for io.netty.maxDirectMemory is 0. A value that is too small (>0) could cause issues.

  • A value of -1 is the default behavior of Netty and is also a good option for R66 (Netty uses twice the JDK Direct Memory maximum, without Cleaner).
  • A value of 0 lets Netty allocate within Direct Memory, using the JDK maximum.
  • A value greater than 0 should not be used (it could limit Netty memory allocation too much) and is strongly discouraged.
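
The "adapted at startup but can be overridden" behavior described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual R66 startup code: the class name `NettyDirectMemoryDefaults` and the method `applyDefaults` are hypothetical; only the two property names and their default values come from this description. Such a step would presumably need to run before any Netty class is loaded, since Netty reads these properties during its own initialization.

```java
import java.util.Properties;

public final class NettyDirectMemoryDefaults {
  public static final String NO_PREFER_DIRECT = "io.netty.noPreferDirect";
  public static final String MAX_DIRECT_MEMORY = "io.netty.maxDirectMemory";

  private NettyDirectMemoryDefaults() {
  }

  /**
   * Sets each default only when the property is absent, so that a value
   * given on the command line with -D is kept as is.
   */
  public static void applyDefaults(Properties props) {
    if (props.getProperty(NO_PREFER_DIRECT) == null) {
      props.setProperty(NO_PREFER_DIRECT, "true");
    }
    if (props.getProperty(MAX_DIRECT_MEMORY) == null) {
      props.setProperty(MAX_DIRECT_MEMORY, "0");
    }
  }

  public static void main(String[] args) {
    // Assumption: called first thing at startup, before Netty is touched
    applyDefaults(System.getProperties());
    System.out.println(NO_PREFER_DIRECT + "=" + System.getProperty(NO_PREFER_DIRECT));
    System.out.println(MAX_DIRECT_MEMORY + "=" + System.getProperty(MAX_DIRECT_MEMORY));
  }
}
```

Launching with, for instance, `-Dio.netty.maxDirectMemory=-1` would leave that value untouched, while the other property would still receive its default.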

Note however that Direct Memory is still used by Netty itself for incoming messages. Disallowing Direct buffers entirely could lead to serious memory usage issues (GC overactivity). Therefore, extra arguments such as -Dio.netty.noUnsafe=true are strongly discouraged.

As before, one can also limit Direct Memory usage by setting usenio to False in the limit section of the XML configuration file.

FileInfo/TransferInfo
In an older revision, FileInfo and TransferInfo were inverted.

FileInfo is the information given by the end user through -info and is linked to the file transfer.
It contains the FollowId field (as a JSON map) so that this information can be forwarded on retransfer.

TransferInfo is the internal information about the transfer, stored in the database by R66 servers and clients.
It can hold the original size, some special information (such as Digest or UUID), and also the Follow ID.
Its internal representation is a JSON map saved as a String (and therefore escaped).

This fix corrects this partial inversion and also improves the HTML rendering by explicitly presenting the Follow Id.
It also fixes some cases where the Follow Id was ignored during retransfer (TransferTask).

@fredericBregier fredericBregier changed the title Fix oodme v3 Fix Out Of Direct Memory Exception Aug 25, 2020
@fredericBregier fredericBregier requested a review from bcarlin Aug 25, 2020
@fredericBregier fredericBregier added the bug Something isn't working label Aug 25, 2020
@fredericBregier fredericBregier added this to the 3.4.1 milestone Aug 25, 2020
@bcarlin (Member) left a comment

That's a lot of optimizations.
I did not spot any obvious errors... just a few outputs that I think should go to the logs (I think they already are in the Netty initialization logs...): it is always unsettling to see output starting with "ERROR" when the server starts OK.

Other than that, in my test, it fixed the OODM errors I had before and stabilized the RAM usage.
good job !

@fredericBregier fredericBregier modified the milestones: 3.4.1, 3.5.0 Aug 28, 2020
In the server XML configuration file, one can specify which addresses (IPs)
to use for the various standard services. By default, as previously, all
interfaces are bound with the specified port. One can therefore limit the
interfaces that are supporting those services.

Examples:

.. code-block:: xml

      <network>
        <serverport>6666</serverport>
        <!-- 1 address defined, on loopback -->
        <serveraddresses>127.0.0.1</serveraddresses>
        <serversslport>6667</serversslport>
        <!-- 2 addresses defined -->
        <serverssladdresses>192.168.0.2,10.1.0.10</serverssladdresses>
        <serverhttpport>8066</serverhttpport>
        <!-- All interfaces will be used -->
        <serverhttpaddresses/>
        <serverhttpsport>8067</serverhttpsport>
        <!-- 1 address defined, on the local network -->
        <serverhttpsaddresses>192.168.0.2</serverhttpsaddresses>
      </network>

.. code-block:: xml

      <network>
        <!-- All interfaces will be used -->
        <serverport>6666</serverport>
        <serversslport>6667</serversslport>
        <serverhttpport>8066</serverhttpport>
        <serverhttpsport>8067</serverhttpsport>
      </network>
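
The binding rule described above (a comma-separated list of addresses, or all interfaces when the element is empty or absent) could be sketched as follows. This is only an illustration under stated assumptions: the class `BindAddresses` and the method `resolve` are hypothetical names, and how the server actually parses these elements is not shown in this change.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

public final class BindAddresses {
  private BindAddresses() {
  }

  /**
   * Turns the text of an addresses element (for instance the content of
   * serverssladdresses) and its port into the socket addresses to bind.
   * A null or empty value means "all interfaces" (wildcard address).
   */
  public static List<InetSocketAddress> resolve(String addresses, int port) {
    List<InetSocketAddress> result = new ArrayList<>();
    if (addresses != null) {
      for (String item : addresses.split(",")) {
        String host = item.trim();
        if (!host.isEmpty()) {
          result.add(new InetSocketAddress(host, port));
        }
      }
    }
    if (result.isEmpty()) {
      // No address given: bind the wildcard address, as previously
      result.add(new InetSocketAddress(port));
    }
    return result;
  }
}
```

With the first example above, `resolve("192.168.0.2,10.1.0.10", 6667)` yields two socket addresses, while `resolve("", 8066)` yields the single wildcard address on port 8066.
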

Under certain circumstances, Direct Memory usage reaches its maximum.
This fix tries to limit that impact as much as possible.

- File copy: Guava, which used large blocks (512 KB), is no longer used, and direct memory usage is reduced (FileChannel)
- File reading: now uses a standard buffer instead of direct memory
- Hash computing: uses a byte array instead of a Netty ByteBuf, except if necessary
- KeepAlive algorithm fixed: under certain circumstances it failed, and a long task delay could erroneously lead to a TimeOut
- Some Netty buffers are freed more quickly
- For LocalPacket, a single buffer is built from the default allocator whenever possible
- One large, long-running IT test added to check correctness (no Out Of Direct Memory detected)
- Update dependencies
DataBlock is the most used packet in R66.
It used an internal ByteBuf representation (using the Netty default),
which could put extra pressure on Direct Memory. This fix switches to byte arrays, so less
Direct Memory is used.

Direct Memory usage was reduced as much as possible.
The default configuration is adapted at startup but can be overridden:
- `-Dio.netty.noPreferDirect=true` is set; it limits the usage of Direct Buffers, except for Netty incoming network messages
- `-Dio.netty.maxDirectMemory=0` is set; Netty will use the JDK maximum Direct Memory (recommended).

The recommended value for `maxDirectMemory` is 0. A value that is too small (>0) could cause issues.
A value of -1 is the default behavior of Netty and is also a good option (Netty uses twice the JDK Direct Memory maximum, without Cleaner).
A value of 0 lets Netty allocate within Direct Memory, using the JDK maximum, while remaining stable.

Note that the test SecnarioLoopPostgreSqlIT can be launched with various arguments:
- `-Xms1024m -Xmx1024m` (or more, such as `2048m`) to limit the default memory used (heap size)

Note however that Direct Memory is still used by Netty itself for incoming messages. Disallowing Direct buffers entirely could lead to serious memory usage issues (GC overactivity). Therefore, extra arguments such as `-Dio.netty.noUnsafe=true` are strongly discouraged.

As before, one can also limit Direct Memory usage by setting `usenio` to `False` in the `limit` section of the XML configuration file.

NetworkPacket was a wrapped buffer of multiple sub-buffers (LocalPacket such as DataPacket).
This change also contains optimizations on NetworkPacket to build a single buffer from all items.
In an older revision, FileInfo and TransferInfo were inverted.

FileInfo is the information given by the end user through `-info` and is linked to the file transfer.
It contains the `FollowId` field (as a JSON map) so that this information can be forwarded on retransfer.

TransferInfo is the internal information about the transfer, stored in the database by R66 servers and clients.
It can hold the original size, some special information (such as Digest or UUID), and also the Follow ID.
Its internal representation is a JSON map saved as a String (and therefore escaped).

This fix corrects this partial inversion and also improves the HTML rendering by explicitly presenting the `Follow Id`.
It also fixes some cases where the Follow Id was ignored during retransfer (`TransferTask`).
@fredericBregier (Collaborator, Author) commented Sep 1, 2020

@bcarlin All done and checked

@fredericBregier fredericBregier requested a review from bcarlin Sep 1, 2020
@fredericBregier fredericBregier changed the title Fix Out Of Direct Memory Exception Fix Out Of Direct Memory Exception and FileInfo/TransferInfo Sep 2, 2020
@bcarlin (Member) commented Sep 4, 2020

I guess this PR can be closed, as it is included in #82 ?

@fredericBregier fredericBregier merged commit 9f60346 into waarp:v3.4 Sep 4, 2020
@fredericBregier fredericBregier deleted the fixOODME-V3 branch Sep 4, 2020