[C++] [Docs] Indicate memory required for installation #18849

Closed
asfimport opened this issue Sep 20, 2021 · 17 comments

Comments

@asfimport

It would be helpful to document the typical memory required for installation. A single core is sufficient for processing power; monitoring with SAR indicates that about 3 GB of RAM is needed for a debug build and 1 GB of RAM for a release build.

Reporter: Benson Muite / @bkmgit
Assignee: Benson Muite / @bkmgit

PRs and other links:

Note: This issue was originally created as ARROW-14039. Please see the migration documentation for further details.

@asfimport

Antoine Pitrou / @pitrou:
It's not obvious what you are asking for here. Installation does not require any particular amount of memory. Memory consumption at runtime depends solely on your usage of the Arrow C++ APIs and on the size of the data you're working on.

Can you clarify what you would like the docs to say concretely?

@asfimport

Benson Muite / @bkmgit:
Would add

"A minimal build requires about 4GB of RAM to compile the libraries."

to building.rst, under the 'Building Requires' section of System setup.

@asfimport

Antoine Pitrou / @pitrou:
But that depends a lot on the compiler, the compile options, and (especially) the chosen parallelism. So it's not a requirement at all.

@asfimport

Benson Muite / @bkmgit:
Tried building with the g++ compiler on CentOS 8 for a minimal debug build. Using a 1-core, 1 GB compute instance seemed to use a lot of swap, and the build process was very slow. With 4 GB of RAM it was OK. Have not tried all options, but this seems like a reasonable minimal configuration. Possibly, one can change "requirement" to "recommendation".

@asfimport

Antoine Pitrou / @pitrou:
How many cores are on that machine? If you build with only one core (for example by passing the appropriate parameter to ninja or make), 1GB may be enough...
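
As a rough illustration of how parallelism interacts with memory, the sketch below picks a job count from the memory available and builds the matching `make` or `ninja` invocation. The per-job memory figure used in the example (1.5 GB) is a made-up placeholder, not a measurement from this thread, and `suggest_jobs` and `build_command` are hypothetical helper names.

```python
import os


def suggest_jobs(available_mem_gb, mem_per_job_gb, cores=None):
    """Return a job count limited by both core count and available memory.

    mem_per_job_gb is an assumed per-compiler-process budget; it has to be
    measured for a given compiler and set of build options.
    """
    if mem_per_job_gb <= 0:
        raise ValueError("mem_per_job_gb must be positive")
    if cores is None:
        cores = os.cpu_count() or 1
    by_memory = int(available_mem_gb // mem_per_job_gb)
    # Always allow at least one job, even if memory looks too small.
    return max(1, min(cores, by_memory))


def build_command(tool, jobs):
    """Build the argument list for make or ninja with an explicit -j value."""
    if tool not in ("make", "ninja"):
        raise ValueError("tool must be 'make' or 'ninja'")
    return [tool, f"-j{jobs}"]


# Example: a 1 GB machine with an assumed 1.5 GB per-job budget.
jobs = suggest_jobs(available_mem_gb=1, mem_per_job_gb=1.5, cores=2)
print(build_command("ninja", jobs))
```

With these assumed numbers the helper falls back to a single job, which is the single-core case being asked about here.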

@asfimport

Benson Muite / @bkmgit:
Cloud VM used had 1 core and 1GB of memory. Can check if this works for a release build, but it is not optimal for a debug build.

@asfimport

Antoine Pitrou / @pitrou:
Hmm, that's interesting. Perhaps you enabled CMAKE_UNITY_BUILD?

@asfimport

Antoine Pitrou / @pitrou:
For the record, I was able to get a release build with a 1GB memory limit in a Docker container when building with two cores. This is using the minimal build example: https://github.com/apache/arrow/tree/master/cpp/examples/minimal_build

@asfimport

Benson Muite / @bkmgit:
Thanks. Not sure if CMAKE_UNITY_BUILD was enabled. Will check, used default debug build option.

@asfimport

Benson Muite / @bkmgit:
Tested release build on CentOS8, 1 core 1GB RAM. Got to 80% memory use. Measured with SAR on commit 1f481d9 (build fails on commit 6ed712a on CentOS8).

 

 


02:19:22 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty 
02:19:24 PM    632196    685404    360852     36.34         0    179316    294520     29.66    131424    128676         4 
02:19:26 PM    619000    674088    374048     37.67         0    181232    306328     30.85    131512    139716      1868 
02:19:28 PM    616036    674256    377012     37.97         0    184140    305284     30.74    132116    142096      4516 
02:19:30 PM    620084    678972    372964     37.56         0    184800    302036     30.42    132760    136668      5044 
02:19:32 PM    493152    552140    499896     50.34         0    184820    430564     43.36    132964    263604      5056 
02:19:34 PM    570680    634124    422368     42.53         0    189276    347944     35.04    133712    185544      9504 
02:19:36 PM    548892    615392    444156     44.73         0    192296    367076     36.96    134196    206824     12524 
02:19:38 PM    479292    545768    513756     51.74         0    192308    434012     43.71    134200    276252     12536 
02:19:40 PM    525680    595228    467368     47.06         0    195380    387548     39.03    134396    229740     15608 
02:19:42 PM    570260    643960    422788     42.57         0    199492    340588     34.30    134892    184748     19720 
02:19:44 PM    602036    679796    391012     39.37         0    203588    300076     30.22    135544    152164     20776 
02:19:46 PM    483524    561720    509524     51.31         0    204020    419304     42.22    135696    269812     21216 
02:19:48 PM    591452    674096    401596     40.44         0    208468    305400     30.75    136168    161756      8464 
02:19:50 PM    457148    540344    535900     53.97         0    209020    440572     44.37    136308    295720      8976 
02:19:52 PM    587852    676324    405196     40.80         0    214296    305400     30.75    136340    165216     14252 
02:19:54 PM    520156    611052    472892     47.62         0    216720    369008     37.16    136416    232752     16676 
02:19:56 PM    451936    542840    541112     54.49         0    216728    436512     43.96    136432    300784     16684 
02:19:58 PM    493396    587432    499652     50.31         0    219860    394588     39.74    136456    259380     19816 
02:20:00 PM    538044    636384    455004     45.82         0    224200    344696     34.71    136516    214656     24156 
02:20:02 PM    555908    658616    437140     44.02         0    228520    322084     32.43    136632    196752     28476 
02:20:04 PM    457920    561116    535128     53.89         0    229008    419324     42.23    136676    294548     28964 
02:20:06 PM    539136    646980    453912     45.71         0    233652    336600     33.90    136752    213368     33608 
02:20:08 PM    402624    535944    590424     59.46         0    259088    447964     45.11    140508    345904     53320 
02:20:10 PM    540408    679016    452640     45.58         0    264372    300076     30.22    140524    208096      5288 
02:20:12 PM    469344    610392    523704     52.74         0    266796    374252     37.69    140596    279236      7712 
02:20:14 PM    401484    542540    591564     59.57         0    266804    439344     44.24    140608    346968      7720 
02:20:16 PM    442680    586880    550368     55.42         0    269932    391856     39.46    140628    305712     10848 
02:20:18 PM    492952    641468    500096     50.36         0    274268    340832     34.32    140724    255468     15196 
02:20:20 PM    504664    657556    488384     49.18         0    278592    322244     32.45    140840    243552     19520 
02:20:22 PM    409744    563132    583304     58.74         0    279080    416072     41.90    140868    338348     20008 
02:20:24 PM    496464    654464    496584     50.01         0    283688    327816     33.01    140952    251632     24616 
02:20:26 PM    442604    676380    550444     55.43         0    359344    309500     31.17    144948    300984     50308 
02:20:28 PM    442604    676384    550444     55.43         0    359348    309500     31.17    144948    301044     50316 
02:20:30 PM    424152    660560    568896     57.29         0    361796    315420     31.76    145020    318916     27600 
02:20:32 PM    166368    430104    826680     83.25         0    388920    556164     56.01    150020    571864     54616 
02:20:34 PM    135820    400596    857228     86.32         0    389984    582540     58.66    150020    602312     30272 
02:20:36 PM    291828    556164    701220     70.61         0    389516    428032     43.10    150044    446692     29808 
02:20:38 PM    142368    406880    850680     85.66         0    389684    577248     58.13    150052    595680       780 
02:20:40 PM    136248    401728    856800     86.28         0    390708    581804     58.59    150052    601804      1792 
02:20:42 PM    247788    512720    745260     75.05         0    390100    468120     47.14    150064    490464      1184 
02:20:44 PM    318768    583928    674280     67.90         0    390324    398448     40.12    150072    419592      1408 
02:20:46 PM    215388    480672    777660     78.31         0    390448    499816     50.33    150096    522476      1532 
02:20:48 PM    318108    583516    674940     67.97         0    390572    402564     40.54    150108    420172      1656 
02:20:50 PM    189608    455048    803440     80.91         0    390604    529688     53.34    150136    548356      1688 
02:20:52 PM    339432    605144    653616     65.82         0    390912    376536     37.92    150136    398700      1996 
02:20:54 PM    304676    570548    688372     69.32         0    391036    410956     41.38    150148    433556      2120 
02:20:56 PM    263248    529344    729800     73.49         0    391252    452760     45.59    150168    474540      2336 
02:20:58 PM    399352    665520    593696     59.79         0    391324    314592     31.68    150180    338552      2408 
02:21:00 PM    265956    532156    727092     73.22         0    391392    448632     45.18    150184    471920      2476 
02:21:02 PM    139356    405820    853692     85.97         0    391652    576080     58.01    150200    598264      2736 
02:21:04 PM    128196    395472    864852     87.09         0    392428    583928     58.80    150200    609424      3512 
02:21:06 PM    162060    428792    830988     83.68         0    391880    554032     55.79    150204    575540      2964 
02:21:08 PM    314100    581216    678948     68.37         0    392264    398624     40.14    150564    423460      3348 
02:21:10 PM    333840    601060    659208     66.38         0    392404    381020     38.37    150596    403700      2900 
02:21:12 PM    156180    423924    836868     84.27         0    392892    557976     56.19    150612    581108      3392 
02:21:14 PM    390088    658092    602960     60.72         0    393152    323592     32.59    150612    347452      2860 
02:21:16 PM    289048    557232    704000     70.89         0    393328    420948     42.39    150632    448532      3032 
02:21:18 PM    232564    500896    760484     76.58         0    393476    480416     48.38    150640    504840      3180 
02:21:20 PM    369208    637672    623840     62.82         0    393608    341856     34.42    150660    368236      3020 
02:21:22 PM    371772    640388    621276     62.56         0    393796    341148     34.35    150684    365684      3208 
02:21:24 PM    241776    510492    751272     75.65         0    393860    471308     47.46    150716    495664      3272 
02:21:26 PM    156996    426116    836052     84.19         0    394264    553828     55.77    150716    580292      3244 
02:21:28 PM    290496    559532    702552     70.75         0    394176    423920     42.69    150716    447040      3156 
02:21:30 PM    310948    580120    682100     68.69         0    394312    402972     40.58    150728    426528      2968 
02:21:32 PM     82348    351532    910700     91.71         0    394324    631624     63.60    150744    654756      2976 
02:21:34 PM     68844    301360    924204     93.07         0    357720    679688     68.44    140780    680268      3264 
02:21:36 PM     71356    276728    921692     92.81         0    330612    706244     71.12    141356    674996      4164 
02:21:38 PM    313248    511836    679800     68.46         0    323792    470580     47.39    142140    432892      3920 
02:21:40 PM    311960    510656    681088     68.59         0    323900    472772     47.61    142160    434096      3128 
02:21:42 PM    349212    547892    643836     64.83         0    323884    432648     43.57    142168    396956      3112 
02:21:44 PM    302000    500808    691048     69.59         0    324012    482480     48.59    142192    444008      3240 
02:21:46 PM    380984    579872    612064     61.63         0    324092    402788     40.56    142192    365244      2412 
02:21:48 PM    331964    531004    661084     66.57         0    324244    451388     45.45    142216    414148      2568 
02:21:50 PM    206144    405200    786904     79.24         0    324260    577704     58.17    142232    539756      2332 
02:21:52 PM    157124    356900    835924     84.18         0    324984    625336     62.97    142232    588588      3032 
02:21:54 PM    372044    571864    621004     62.54         0    325028    411148     41.40    142232    374196      3076 
02:21:56 PM    253056    453016    739992     74.52         0    325160    528188     53.19    142248    492928      2776 
02:21:58 PM    195988    396504    797060     80.26         0    325756    583012     58.71    142248    549856      3372 
02:22:00 PM    401476    601804    591572     59.57         0    325528    381436     38.41    142256    344752      2676 
02:22:02 PM    324428    524912    668620     67.33         0    325684    454304     45.75    142276    421684      2836 
02:22:04 PM    300668    501444    692380     69.72         0    325976    480996     48.44    142284    445300      3128 
02:22:06 PM    460232    661168    532816     53.65         0    326136    319376     32.16    142292    285880      3288 
02:22:08 PM    313988    515004    679060     68.38         0    326216    468500     47.18    142324    431996      3356 
02:22:10 PM    321196    522560    671852     67.66         0    326564    461668     46.49    142332    424768      3704 
02:22:12 PM    425236    626688    567812     57.18         0    326644    354384     35.69    142376    320812      2988 
02:22:14 PM    405912    607504    587136     59.12         0    326784    374116     37.67    142392    340228      3108 
02:22:16 PM    399648    601312    593400     59.76         0    326844    380840     38.35    142424    346412      2916 
02:22:18 PM    418564    620332    574484     57.85         0    326948    363256     36.58    142444    327320      3016 
02:22:20 PM    397796    599700    595252     59.94         0    327084    381456     38.41    142520    348180      3152 
02:22:22 PM    393860    595932    599188     60.34         0    327252    382932     38.56    142520    352068      2996 
02:22:24 PM    471524    673684    521524     52.52         0    327368    303668     30.58    142548    274308      3116 
02:22:26 PM    287744    490240    705304     71.02         0    327668    494104     49.76    142604    457896      2504 
02:22:28 PM    444360    646872    548688     55.25         0    327680    333344     33.57    142624    301396      2516 
02:22:30 PM    347596    550224    645452     65.00         0    327832    430668     43.37    142636    398212      2668 
02:22:32 PM    345164    547980    647884     65.24         0    327976    435476     43.85    142684    400416      2448 
02:22:34 PM    329852    532800    663196     66.78         0    328108    450600     45.38    142704    415864      2580 
02:22:36 PM    341196    544156    651852     65.64         0    328120    438172     44.12    142736    404548      2164 
02:22:38 PM    346596    549688    646452     65.10         0    328244    431204     43.42    142772    398900      2284 
02:22:40 PM    326496    529876    666552     67.12         0    328528    452964     45.61    142792    418972      2568 
02:22:42 PM    434336    637728    558712     56.26         0    328532    346624     34.91    142896    311180      2004 
02:22:44 PM    323396    526832    669652     67.43         0    328576    469944     47.32    143084    421716      2048 
02:22:46 PM    308844    512444    684204     68.90         0    328740    472264     47.56    143088    436296      1896 
02:22:48 PM    231308    434908    761740     76.71         0    328736    545512     54.93    143100    513692      1892 
02:22:50 PM    312076    515832    680972     68.57         0    328892    467108     47.04    143112    433088      2048 
02:22:52 PM    354040    558012    639008     64.35         0    329108    424596     42.76    143124    391248      2008 
02:22:54 PM    205240    409580    787808     79.33         0    329476    570412     57.44    143124    539704      2376 
02:22:56 PM    282444    486668    710604     71.56         0    329352    493288     49.67    143140    462532      2252 
02:22:58 PM    133824    338048    859224     86.52         0    329352    645628     65.01    143140    610892      1820 
02:23:00 PM     97584    302932    895464     90.17         0    330480    675360     68.01    143140    647036      2952 
02:23:02 PM    408916    613912    584132     58.82         0    330120    368624     37.12    143148    336220      2228 
02:23:04 PM    200116    405212    792932     79.85         0    330240    575384     57.94    143148    544736      2348 
02:23:06 PM    452164    659332    540884     54.47         0    332284    319864     32.21    151084    285268      2584 
02:23:08 PM    400572    621948    592476     59.66         0    346452    357012     35.95    171196    316120     16128 
02:23:10 PM    449768    682884    543280     54.71         0    358184    295248     29.73    185208    253776     28264 
02:23:12 PM    449768    682884    543280     54.71         0    358184    295248     29.73    185208    253776     27980 
02:23:14 PM    449800    682932    543248     54.71         0    358200    295248     29.73    185224    253776     27980 
02:23:16 PM    449832    682964    543216     54.70         0    358200    295248     29.73    185224    253776     27980
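
For anyone who wants to summarise a log like this rather than read it by eye, here is a minimal Python sketch that parses `sar -r` style output and reports the row with the highest `%memused` and the row with the lowest `kbavail`. It assumes 12-hour timestamps (two leading fields, as in the log above); `parse_sar_memory` and `SAMPLE` are hypothetical names, and the three data rows in `SAMPLE` are copied from the log above.

```python
def parse_sar_memory(text):
    """Parse `sar -r` style output into a list of dicts keyed by column name.

    Assumes each line starts with a two-field timestamp such as "02:19:22 PM".
    """
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        if "kbmemfree" in fields:
            # Header line: remember the column names after the timestamp.
            header = fields[2:]
            continue
        if header is None or len(fields) - 2 != len(header):
            continue
        try:
            row = {name: float(value) for name, value in zip(header, fields[2:])}
        except ValueError:
            continue  # skip lines that are not numeric data rows
        row["time"] = " ".join(fields[:2])
        rows.append(row)
    return rows


SAMPLE = """\
02:19:22 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
02:19:24 PM    632196    685404    360852     36.34         0    179316    294520     29.66    131424    128676         4
02:21:34 PM     68844    301360    924204     93.07         0    357720    679688     68.44    140780    680268      3264
02:21:36 PM     71356    276728    921692     92.81         0    330612    706244     71.12    141356    674996      4164
"""

rows = parse_sar_memory(SAMPLE)
busiest = max(rows, key=lambda r: r["%memused"])
tightest = min(rows, key=lambda r: r["kbavail"])
print("peak %memused:", busiest["time"], busiest["%memused"])
print("lowest kbavail:", tightest["time"], int(tightest["kbavail"]))
```

In this log `kbmemfree` and `kbmemused` sum to the same total on every row, so `%memused` here includes page cache; `kbavail` is likely the more useful column when judging whether the build is close to swapping.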

 

@asfimport

Benson Muite / @bkmgit:
Suggest adding a 4 GB recommendation for a debug build and a 1 GB recommendation for a release build.

@asfimport

Antoine Pitrou / @pitrou:
@amol- @jorisvandenbossche @westonpace Any opinions about this?

@asfimport

Weston Pace / @westonpace:
How much merit is there in identifying narrow build requirements? Even 1GB for a release build seems rather onerous to maintain and 4GB for both is probably simpler. The test effort would be complicated because, as you pointed out earlier, there are many factors that could affect build requirements. We may get away with 1GB of RAM on CentOS8 and end up needing 2GB on some other distribution or OS. If someone needs to run in a limited-RAM environment it would make more sense I think to compile on a capable build machine and distribute the binaries.

@asfimport

Benson Muite / @bkmgit:
While one of the aims of Arrow is to support large datasets, being able to run in low resource settings enables wider adoption as a standard backend - for example R will download and build Arrow on Linux when one installs the R bindings.

The requirement is not meant to be very precise, more a suggestion as to what to expect. It is possible to add memory use monitoring to the CI builds, though again this would need maintenance. We want someone installing Arrow (at least the debug build, as virtual machines with less than 1 GB are rare) to know that if the build is proceeding very slowly and they have limited RAM, swapping to disk is the likely reason for the slow build.
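
One possible shape for such monitoring, as a minimal sketch only: wrap the build command and read the peak resident set size of its child processes afterwards. This uses Python's standard `resource` module (Unix only); `run_and_measure` is a hypothetical helper name, and the command in the example is a stand-in for a real build step.

```python
import resource
import subprocess
import sys


def run_and_measure(cmd):
    """Run cmd and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for waited-for children: the largest single
    process, not the sum over processes running in parallel. On Linux the
    unit is kilobytes.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Stand-in for a build step: a child process that allocates about 10 MB.
code, peak = run_and_measure([sys.executable, "-c", "x = bytearray(10_000_000)"])
print("exit code:", code, "peak RSS of largest child:", peak)
```

Because the figure is per process, it would show how heavy the largest compiler or linker invocation is, but total memory under a parallel build would still need system-level sampling such as SAR.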

@asfimport

Weston Pace / @westonpace:

"being able to run in low resource settings enables wider adoption as a standard backend"

I may be misunderstanding still but I think the discussion is about building and not running. I absolutely agree that Arrow should be able to run with minimal memory and that might be worth defining a limit for.

"for example R will download and build Arrow on Linux when one installs the R bindings."

I believe R always compiles the bindings, but it shouldn't compile arrow-cpp if the package is already present, for example if the user has already installed the CentOS 8 Arrow package from EPEL. The one exception might be golang (statically compiles everything), but it has pretty strong cross-compilation support.

"The requirement is not meant to be very precise, more a suggestion as to what to expect. It is possible to add memory use monitoring to the CI builds, though again this would need maintenance. We want someone installing Arrow (at least the debug build, as virtual machines with less than 1 GB are rare) to know that if the build is proceeding very slowly and they have limited RAM, swapping to disk is the likely reason for the slow build."

What if we just add a generic statement:

Arrow C++ is a complex project that needs to handle many different data types, vectorization architectures, and compiler differences. Building Arrow C++ requires a considerable amount of CPU and RAM. When installing Arrow on a system with limited resources we recommend compiling the binaries on a capable build machine or downloading prebuilt binaries from package managers.

If you want to replace "considerable amount of CPU and RAM" with "potentially more than 4GB of RAM" (or insert your number here) I wouldn't really be opposed. I think my concern would be more with a phrase like "at most 4GB of RAM" because we have no way of reliably backing that up other than "On these build machines with these configurations it took less than 4GB" and that isn't really the same thing.

@asfimport

Benson Muite / @bkmgit:
The statement is helpful. At the moment, installation instructions are in the developer documentation for C++ and Python. Python also has instructions for a user install, but the other languages do not. Maybe there should be a standalone generic installation page? Such a page could start with:

For most users, downloading prebuilt binaries from package managers should be sufficient. Prebuilt binaries are available at:

list of repositories with hyperlinks

Arrow C++ is a complex project that can handle many different data types, vectorization architectures, and compiler differences, so the prebuilt binaries may not fit your use. In that case, see the developer documentation for instructions on how to compile Arrow from source.

For developers, possibly the following is enough to add to the C++ and Python pages:

Arrow C++ is a complex project that needs to handle many different data types, vectorization architectures, and compiler differences. Most configurations of Arrow C++ will build with 4 GB of RAM on a single-core CPU. If you need to build Arrow for use on a system with limited resources that does not have prebuilt binaries, we recommend compiling the binaries on a more capable build machine.

@asfimport

Jonathan Keane / @jonkeane:
Issue resolved by pull request 11205
#11205
