Memory restrictions caused failure TLJH install failure on AWS EC2 #518

Closed
charlesfarr opened this issue Feb 25, 2020 · 8 comments
Labels
bug Something isn't working

Comments

@charlesfarr

charlesfarr commented Feb 25, 2020

I have tried multiple times to install TLJH on an AWS EC2 instance, failing every time. The process has thrown a number of different errors, but the most common appears to be with conda installation.

Error logs

ubuntu@ip-172-31-91-142:~$ curl https://raw.githubusercontent.com/jupyterhub/the-littlest-jupyterhub/master/bootstrap/bootstrap.py | sudo -E python3 - --admin cfarr
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  6354  100  6354    0     0  45385      0 --:--:-- --:--:-- --:--:-- 45385
Checking if TLJH is already installed...
Setting up hub environment
Installed python & virtual environment
Set up hub virtual environment
Setting up TLJH installer...
Setup tljh package
Starting TLJH installer...
Setting up admin users
Granting passwordless sudo to JupyterHub admins...
Setting up user environment...
Downloading & setting up user environment...
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 508, in <module>
    main()
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 491, in main
    ensure_user_environment(args.user_requirements_txt_url)
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 272, in ensure_user_environment
    'conda==' + conda_version
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/conda.py", line 109, in ensure_conda_packages
    ] + packages).decode()
  File "/usr/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/opt/tljh/user/bin/python', '-m', 'conda', 'install', '-c', 'conda-forge', '--json', '--prefix', '/opt/tljh/user', 'conda==4.8.1']' died with <Signals.SIGKILL: 9>.

I was following the guide to install on your own server and had run the following:

sudo -E apt install python3 python3-dev git curl

which failed. Ubuntu suggested I run the following:

sudo apt-get update

which succeeded, so I re-ran the following:

sudo -E apt install python3 python3-dev git curl

which succeeded, so I ran the TLJH install:

curl https://raw.githubusercontent.com/jupyterhub/the-littlest-jupyterhub/master/bootstrap/bootstrap.py | sudo -E python3 - --admin cfarr

which threw the above error.

Any and all help is appreciated!
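
For anyone reading the traceback above: `died with <Signals.SIGKILL: 9>` means the conda subprocess was killed from outside rather than exiting on its own. On a small instance that is often, though not always, the kernel's out-of-memory killer. Below is a minimal Python sketch of how this shows up in `subprocess`. The helper name `describe_returncode` and the self-killing child process are made up for illustration and are not part of TLJH:

```python
import signal
import subprocess
import sys


def describe_returncode(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable note."""
    if returncode < 0:
        # On POSIX, a negative return code -N means "terminated by signal N".
        sig = signal.Signals(-returncode)
        note = f"killed by {sig.name}"
        if sig == signal.SIGKILL:
            note += " (possibly the kernel OOM killer; check dmesg)"
        return note
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Hypothetical stand-in for the conda call: a child that SIGKILLs itself.
    child = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
    try:
        subprocess.check_output(child)
    except subprocess.CalledProcessError as err:
        print(describe_returncode(err.returncode))
```

A SIGKILL cannot be caught by the killed process itself, so conda prints no error of its own; the kernel log is usually where to look for the cause.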

@GeorgianaElena
Member

Hey @charlesfarr! I've seen a similar error in our integration tests when we were not ensuring a minimum of 1GB of RAM (ref: #479).
Not sure if this is the case here, but I think it's worth investigating.
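
One way to rule this out before running the installer is to check the machine's total RAM first. Here is a rough sketch, assuming a Linux host that exposes `/proc/meminfo`. The function names and the `MIN_RAM_KB` constant are illustrative; the threshold is taken from the 1GB figure mentioned above, not from the TLJH code:

```python
MIN_RAM_KB = 1_000_000  # assumed threshold, roughly 1 GB


def mem_total_kb(meminfo_text: str) -> int:
    """Extract MemTotal (in kB) from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like: "MemTotal:        1009140 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def has_enough_ram(meminfo_text: str, minimum_kb: int = MIN_RAM_KB) -> bool:
    """Return True if the reported total RAM meets the assumed minimum."""
    return mem_total_kb(meminfo_text) >= minimum_kb


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
        print(f"MemTotal: {mem_total_kb(text)} kB, "
              f"enough: {has_enough_ram(text)}")
    except FileNotFoundError:
        print("/proc/meminfo not available on this system")
```

Note that the reported `MemTotal` is typically a little below the nominal instance size, because the kernel reserves some memory for itself.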

Also, check out the Installing on Amazon Web Services guide, as it's more specific to your use case than the guide to install on your own server.

Hope this helps.

@twrobinson

twrobinson commented Mar 23, 2020

Hi, the Read the Docs page states that "The AWS free tier is fully capable of running a minimal littlest Jupyterhub for testing purposes.", but it seems to me that the 1 GB free tier instance on AWS (t2.micro) is not sufficient, or is at best "barely sufficient", to install TLJH, and the installation is likely to fail in unpredictable ways.

Installing from User Data results in a hang in "Setting up user environment...".

Installing from a terminal with "curl https://raw.githubusercontent.com/jupyterhub/the-littlest-jupyterhub/master/bootstrap/bootstrap.py | sudo -E python3 - --admin "

resulted in

Downloading & setting up user environment...
Traceback (most recent call last):
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/exceptions.py", line 1062, in __call__
    return func(*args, **kwargs)
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/cli/main.py", line 84, in _main
    exit_code = do_call(args, p)
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/cli/conda_argparse.py", line 82, in do_call
    exit_code = getattr(module, func_name)(args, parser)
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/cli/main_install.py", line 20, in execute
    install(args, parser, 'install')
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/cli/install.py", line 308, in install
    handle_txn(unlink_link_transaction, prefix, args, newenv)
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/cli/install.py", line 333, in handle_txn
    unlink_link_transaction.download_and_extract()
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/core/link.py", line 191, in download_and_extract
    self._pfe.execute()
  File "/opt/tljh/user/lib/python3.7/site-packages/conda/core/package_cache_data.py", line 661, in execute
    raise CondaMultiError(exceptions)
conda.CondaMultiError: Error with archive /opt/tljh/user/pkgs/conda-4.8.1-py37_0.conda.  You probably need to delete and re-download or re-create this file.  Message from libarchive was:

Zstd decompression failed: Allocation error : not enough memory (errno=-1, retcode=-30, archive_p=94800679832192)

with subsequent attempts getting further:

Downloading traefik 1.7.18...


Ran /opt/tljh/user/bin/jupyter lab build --minimize=False --dev-build=False with exit code 1
[LabBuildApp] JupyterLab 1.2.7
[LabBuildApp] Building in /opt/tljh/user/share/jupyter/lab
[LabBuildApp] Building jupyterlab assets (build:prod)
An error occured.
RuntimeError: JupyterLab failed to build
See the log file for details:  /tmp/jupyterlab-debug-fykwpamn.log

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 508, in <module>
    main()
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 496, in main
    ensure_jupyterlab_extensions()
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/installer.py", line 194, in ensure_jupyterlab_extensions
    ] + build_options)
  File "/opt/tljh/hub/lib/python3.6/site-packages/tljh/utils.py", line 32, in run_subprocess
    raise subprocess.CalledProcessError(cmd=cmd, returncode=proc.returncode)
subprocess.CalledProcessError: Command '['/opt/tljh/user/bin/jupyter', 'lab', 'build', '--minimize=False', '--dev-build=False']' returned non-zero exit status 1.

where the log file ends in:

[LabBuildApp] yarn run v1.15.2
$ ensure-max-old-space webpack --config webpack.prod.config.js
child_process.js:650
    throw err;
    ^

Error: Command failed: /opt/tljh/user/share/jupyter/lab/staging/node_modules/.bin/webpack --config webpack.prod.config.js
    at checkExecSyncError (child_process.js:629:11)
    at Object.execFileSync (child_process.js:647:13)
    at Object.<anonymous> (/opt/tljh/user/share/jupyter/lab/staging/node_modules/@jupyterlab/buildutils/lib/ensure-max-old-space.js:38:17)
    at Module._compile (internal/modules/cjs/loader.js:778:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
    at Module.load (internal/modules/cjs/loader.js:653:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:593:12)
    at Function.Module._load (internal/modules/cjs/loader.js:585:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
    at startup (internal/bootstrap/node.js:283:19)
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

[LabBuildApp] JupyterLab failed to build
[LabBuildApp] Traceback (most recent call last):

[LabBuildApp]   File "/opt/tljh/user/lib/python3.7/site-packages/jupyterlab/debuglog.py", line 47, in debug_logging
    yield

[LabBuildApp]   File "/opt/tljh/user/lib/python3.7/site-packages/jupyterlab/labapp.py", line 98, in start
    command=command, app_options=app_options)

[LabBuildApp]   File "/opt/tljh/user/lib/python3.7/site-packages/jupyterlab/commands.py", line 459, in build
    command=command, clean_staging=clean_staging)

[LabBuildApp]   File "/opt/tljh/user/lib/python3.7/site-packages/jupyterlab/commands.py", line 669, in build
    raise RuntimeError(msg)

[LabBuildApp] RuntimeError: JupyterLab failed to build

[LabBuildApp] Exiting application: JupyterLab

and further attempts eventually succeeded, after long hang times with "top" showing kswapd0 dominating the CPU, and "free" showing only 60 MB free.

Cheers,

Tim

@daniel214

Also, check out the Installing on Amazon Web Services guide, as it's more specific to your use case than the guide to install on your own server.

This guide needs to be updated to match the current reality... especially since it's a step-by-step guide that tends to attract people unfamiliar with the systems. I ended up here after unsuccessfully following this guide several times.

I would at least change this bit and stop recommending the micro tier as something for new users to try!

(For reference, a minimal hub that worked for developing this tutorial used a t2.micro tier, which is free for Amazon users the first year they sign up. Two users were able to concurrently utilize this development hub without issue.)

@manics
Member

manics commented Aug 20, 2020

Hi all. We rely on community contributions to help us keep the docs up to date, since not everyone has access to an AWS account for testing. If you're able to open a PR with the correct instructions that would help us a lot!

@thomasroshin

thomasroshin commented Feb 4, 2021

I am encountering the same error.

AWS free tier, Ubuntu 20.04.1 LTS. When user data is supplied while creating the instance, the install hangs.

The system log during the "hang" shows:

[   32.273813] cloud-init[1649]: Checking if TLJH is already installed...
[   32.280924] cloud-init[1649]: Setting up hub environment
         Starting PackageKit Daemon...
[  OK  ] Started PackageKit Daemon.

After a while I stopped the EC2 instance, and this updated the system log with the error message:

[  271.379721] cloud-init[1649]: Setting up user environment...
[  271.386153] cloud-init[1649]: Downloading & setting up user environment...
[  533.810093] Out of memory: Killed process 8703 (python) total-vm:983768kB, anon-rss:725792kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:1556kB oom_score_adj:0
[  533.174715] cloud-init[1649]: Traceback (most recent call last):
[  533.201945] cloud-init[1649]:   File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
[  533.236481] cloud-init[1649]:     return _run_code(code, main_globals, None,
[  533.256037] cloud-init[1649]:   File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
[  533.274421] cloud-init[1649]:     exec(code, run_globals)
[  533.288329] cloud-init[1649]:   File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 534, in <module>
[  533.301588] cloud-init[1649]:     main()
[  533.317044] cloud-init[1649]:   File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 506, in main
[  533.341697] cloud-init[1649]:     ensure_user_environment(args.user_requirements_txt_url)
[  533.355690] cloud-init[1649]:   File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 280, in ensure_user_environment
[  533.377006] cloud-init[1649]:     conda.ensure_conda_packages(USER_ENV_PREFIX, [
[  533.398173] cloud-init[1649]:   File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/conda.py", line 103, in ensure_conda_packages
[  533.413291] cloud-init[1649]:     raw_output = subprocess.check_output(conda_executable + [
[  533.431231] cloud-init[1649]:   File "/usr/lib/python3.8/subprocess.py", line 411, in check_output
[  533.449487] cloud-init[1649]:     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
[  533.465331] cloud-init[1649]:   File "/usr/lib/python3.8/subprocess.py", line 512, in run
[  533.475345] cloud-init[1649]:     raise CalledProcessError(retcode, process.args,
[  533.486151] cloud-init[1649]: subprocess.CalledProcessError: Command '['/opt/tljh/user/bin/python', '-m', 'conda', 'install', '-c', 'conda-forge', '--json', '--prefix', '/opt/tljh/user', 'conda==4.8.1']' died with <Signals.SIGKILL: 9>.
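
The `Out of memory` line in the log above already says how much memory the killed process was using. As a quick worked example, the figures can be pulled out and converted to MiB. This is only a sketch: the regular expression and the function name are my own, written against this one log line rather than any kernel specification:

```python
import re

OOM_LINE = (
    "[  533.810093] Out of memory: Killed process 8703 (python) "
    "total-vm:983768kB, anon-rss:725792kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:0 pgtables:1556kB oom_score_adj:0"
)

# Matches the pid, process name, and the first two memory figures.
OOM_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_line(line: str) -> dict:
    """Return pid, process name, and memory figures (MiB) from an OOM log line."""
    match = OOM_RE.search(line)
    if match is None:
        raise ValueError("not an OOM-killer line")
    return {
        "pid": int(match["pid"]),
        "name": match["name"],
        "total_vm_mib": int(match["total_vm"]) / 1024,
        "anon_rss_mib": int(match["anon_rss"]) / 1024,
    }


if __name__ == "__main__":
    info = parse_oom_line(OOM_LINE)
    print(f"{info['name']} (pid {info['pid']}): "
          f"{info['anon_rss_mib']:.0f} MiB resident")
```

For the line above this gives roughly 709 MiB of anonymous resident memory for the `python` process (the conda install), which is most of the RAM on a 1 GB instance.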

Since an earlier comment said that multiple retries succeeded, I manually tried to run

curl -L https://tljh.jupyter.org/bootstrap.py | sudo -E python3 - --admin

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
100 10949  100 10949    0     0   5999      0  0:00:01  0:00:01 --:--:--  5999
Checking if TLJH is already installed...
TLJH already installed, upgrading...
Upgrading TLJH installer...
Upgraded pip
Setup tljh package
Starting TLJH installer...
Setting up admin users
Granting passwordless sudo to JupyterHub admins...
Setting up user environment...
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 534, in <module>
    main()
  File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 506, in main
    ensure_user_environment(args.user_requirements_txt_url)
  File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/installer.py", line 280, in ensure_user_environment
    conda.ensure_conda_packages(USER_ENV_PREFIX, [
  File "/opt/tljh/hub/lib/python3.8/site-packages/tljh/conda.py", line 103, in ensure_conda_packages
    raw_output = subprocess.check_output(conda_executable + [
  File "/usr/lib/python3.8/subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/opt/tljh/user/bin/python', '-m', 'conda', 'install', '-c', 'conda-forge', '--json', '--prefix', '/opt/tljh/user', 'conda==4.8.1']' died with <Signals.SIGKILL: 9>.
ubuntu@ip-172-31-9-38:~/tljh$ curl -L https://tljh.jupyter.org/bootstrap.py | sudo -E python3 - --admin <adminuser>


Even if the minimum configuration is not known (sorry, I cannot upgrade and test), could you at least remove the statement "The AWS free tier is fully capable of running a minimal littlest Jupyterhub for testing purposes."? It seems incorrect based on reports from multiple users.

@twrobinson, how were you able to get this working? Even after multiple attempts I am still running into this issue. Are there any pointers for installing without extensions (a stripped-down version?), without webpack/npm?

@manics
Member

manics commented Feb 4, 2021

Might be worth looking at replacing conda with mamba to see if that reduces the memory requirement?
If there are errors whilst installing JupyterLab plugins this should hopefully go away with the new JupyterLab 3 support for pre-built extensions.

@Tebinski

Tebinski commented Feb 9, 2021

With a t3.small, the system works fine. However, after creating the instance, we had to run curl -L https://tljh.jupyter.org/bootstrap.py | sudo -E python3 - --admin again to bring up the JupyterHub server.

@consideRatio
Member

We have recently updated the memory requirements described in the documentation, and have also made updates that reduce the memory footprint. Since this issue seems to relate to memory, I'll go for a close.

@consideRatio consideRatio added the bug Something isn't working label Oct 25, 2021
@consideRatio consideRatio changed the title Error on installation of TLJH on AWS EC2 Memory restrictions caused failure on TLJH on AWS EC2 Oct 25, 2021
@consideRatio consideRatio changed the title Memory restrictions caused failure on TLJH on AWS EC2 Memory restrictions caused failure TLJH install failure on AWS EC2 Oct 25, 2021
8 participants