ModuleLoadTest Class #189

Closed
shahzebsiddiqui opened this issue Feb 26, 2020 · 5 comments

@shahzebsiddiqui
Member

Now that we have a Spider() class, we can implement the ModuleLoadTest class, which should provide the following features:

  • Test one or more module trees
  • Allow for module purge and module --force purge before each test
  • Filter tests by module name (analogous to buildtest module list --filter-include "GCC")
  • Debug mode for troubleshooting
  • Test in a sub-shell (default behavior) or in a login shell (buildtest module loadtest --login)

We may also want to ensure MODULEPATH is set to the tree specified in the ModuleLoadTest class.
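To make the purge, login-shell, and MODULEPATH options concrete, here is a minimal sketch of how the command for a single module test might be assembled. The function name build_load_command and its parameters are hypothetical and not part of buildtest; the sketch also assumes bash, and that the module function is available in the spawned shell (in a non-login shell this depends on how the site initializes Lmod).

```python
import shlex


def build_load_command(module, force_purge=False, login=False, modulepath=None):
    """Return an argv list that tests loading one module in a fresh bash shell.

    Hypothetical helper: names and parameters are assumptions for illustration.
    """
    steps = []
    if modulepath:
        # Pin MODULEPATH to the tree under test so other trees are ignored.
        steps.append(f"export MODULEPATH={shlex.quote(modulepath)}")
    # Start from a clean environment before every load test.
    steps.append("module --force purge" if force_purge else "module purge")
    steps.append(f"module load {shlex.quote(module)}")
    script = " && ".join(steps)
    # "bash -l" starts a login shell; plain "bash -c" is the sub-shell default.
    shell = ["bash", "-l", "-c"] if login else ["bash", "-c"]
    return shell + [script]
```

The returned list can be passed straight to subprocess.run, and a non-zero return code would indicate a module that failed to load.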

The purpose of this feature is to let a sysadmin or software administrator validate all the modules in their stack with a single command. This feature is already present in buildtest, but the intent is to make it an API and refactor the existing code. Users can then use the API to customize how modules are tested at their site.

Currently, buildtest only provides a command-line method. Now that we have a Spider class, this should work really well: we can get all modules from a tree and then test each one.

Ultimately this would replace the module_load_test method, which is quite ugly: https://github.com/HPC-buildtest/buildtest-framework/blob/devel/buildtest/tools/modules.py#L231

@vsoch
Collaborator

vsoch commented Feb 26, 2020

Would module load be a command associated with ModuleCommand? (Discussion here: #188)

@vsoch
Collaborator

vsoch commented Feb 26, 2020

I'm generally thinking that a class should represent an actual wrapper for something in the real world (e.g. the module command) and then different states / functions can be represented with that class (e.g., module load). So my question is - why is module load fundamentally different / distinct from the (current) Module/ModuleCommand?

@shahzebsiddiqui
Member Author

shahzebsiddiqui commented Feb 26, 2020

I understand your reasoning. The ModuleCommand is going to be used for most if not all module operations that happen with the actual module command such as module [purge] [load] | [unload] [save] [describe] [restore].

What I had in mind is that the YAML schema could have a key, maybe moduleload (see https://github.com/HPC-buildtest/buildtest-framework/wiki/Yaml-Schema-Draft), for specifying which modules to use when building a test. This could be a user collection or a module key (e.g., building for all versions of GCC).

Since Module (a.k.a. ModuleCommand) takes arbitrary modules, it makes perfect sense to use Module().test_modules() to verify that modules can be loaded as part of test creation. If users are allowed to specify this in YAML, we can inject it directly. Or, if the user specifies a collection name, we can test it via Module().test_collection() before generating the test.

ModuleLoadTest addresses the higher-level concept of testing an entire module tree (a.k.a. software stack). Since Spider can provide a list of all modules, we can use ModuleLoadTest to test every software package. For a software stack administrator it is important to check that all modules work, so having one command that automates a test of all modules is a big win. Considering that sites have 1k+ or even 10k+ modules, this command will be useful with a tool like Jenkins to run daily.
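As a rough sketch of that loop, assuming the list of module names has already been obtained (for example from Spider, whose exact methods are not assumed here): each module is tested in turn, results are split into passed and failed, and an optional count stops the run early. The names run_load_tests, build_command, and run are hypothetical.

```python
import subprocess


def _run(argv):
    """Run one command and return its exit code (default runner)."""
    return subprocess.run(argv, capture_output=True, text=True).returncode


def run_load_tests(modules, build_command, count=None, run=_run):
    """Test each module and return (passed, failed) lists of module names.

    modules:       iterable of full module names, e.g. "GCC/6.4.0"
    build_command: callable mapping a module name to an argv list
    count:         stop after this many modules (None means test everything)
    run:           callable mapping an argv list to an exit code
    """
    passed, failed = [], []
    for tested, module in enumerate(modules):
        if count is not None and tested >= count:
            break
        exit_code = run(build_command(module))
        (passed if exit_code == 0 else failed).append(module)
    return passed, failed
```

Keeping the runner injectable means the loop can be exercised without a real module system, and a CI job such as Jenkins could simply fail the build whenever the failed list is non-empty.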

The way I see ModuleLoadTest working is the following:

# test everything, use MODULEPATH as the value for all module trees
a = ModuleLoadTest()

# test software tree /opt/apps/easybuild and /opt/apps/spack
b = ModuleLoadTest("/opt/apps/easybuild:/opt/apps/spack")

# test /opt/apps/spack and force purge modules, and run each test in login shell
c = ModuleLoadTest("/opt/apps/spack", force=True, login=True)

# test all modules defined in MODULEPATH in a login shell and stop after 50 modules
d = ModuleLoadTest(login=True, count=50)

# test all modules named GCC, OpenMPI, and LAPACK; every version of each is tested
e = ModuleLoadTest(key=["GCC", "OpenMPI", "LAPACK"])

# include specific module versions to test from module tree /opt/apps/easybuild
f = ModuleLoadTest("/opt/apps/easybuild", include_list=["GCCcore/6.4.0", "OpenMPI/3.0.1"])

# exclude specific module versions from the test of /opt/apps/spack
g = ModuleLoadTest("/opt/apps/spack", exclude_list=["GCC/5.4.0", "CUDA/10.1"])

This is just an idea of how I think ModuleLoadTest can be used. I foresee that each argument to the __init__ method can be exposed on the command line (buildtest module loadtest) and in the configuration file.
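One way the constructor arguments might map onto command-line options is sketched below with argparse. Only --login appears earlier in this thread; every other option name here is an assumption for illustration, not an existing buildtest flag.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the proposed ModuleLoadTest arguments."""
    parser = argparse.ArgumentParser(prog="buildtest module loadtest")
    # Optional colon-separated module trees; None means fall back to MODULEPATH.
    parser.add_argument("tree", nargs="?", default=None)
    parser.add_argument("--force", action="store_true",
                        help="run 'module --force purge' before each test")
    parser.add_argument("--login", action="store_true",
                        help="run each test in a login shell")
    parser.add_argument("--count", type=int, default=None,
                        help="stop after testing this many modules")
    parser.add_argument("--key", nargs="+", default=None,
                        help="module names to test (all versions)")
    parser.add_argument("--include-list", nargs="+", dest="include_list",
                        default=None, help="specific module versions to test")
    parser.add_argument("--exclude-list", nargs="+", dest="exclude_list",
                        default=None, help="specific module versions to skip")
    return parser
```

Because the parsed attribute names match the proposed keyword arguments, the resulting namespace could be converted with vars() and forwarded to the class, and the same keys could be read from a configuration file.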

The include and exclude lists are version-specific, which is useful for filtering modules or excluding modules that are in development or bound to break. For instance, modules such as VASP, MATLAB, and GAUSSIAN are sometimes configured to be loadable only by a specific user group, because that group paid for the software and does not want others to use it due to license seat restrictions. Sometimes even a site admin can't load a module if it is only accessible to a particular Unix group. A full listing of modulefile functions in Lua is covered at https://lmod.readthedocs.io/en/latest/050_lua_modulefiles.html
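A minimal sketch of how the name filter and the version-specific lists could be applied to a list of full module names follows. The function select_modules is hypothetical, and it assumes names of the form name/version.

```python
def select_modules(modules, key=None, include_list=None, exclude_list=None):
    """Return the subset of modules that should be load-tested.

    key:          module names to keep, matching every version (e.g. "GCC")
    include_list: full name/version entries to keep; others are dropped
    exclude_list: full name/version entries to skip
    """
    selected = []
    for full_name in modules:
        # "GCC/6.4.0" -> "GCC"; a name without a version is kept as-is.
        name = full_name.split("/", 1)[0]
        if key and name not in key:
            continue
        if include_list and full_name not in include_list:
            continue
        if exclude_list and full_name in exclude_list:
            continue
        selected.append(full_name)
    return selected
```

For example, passing exclude_list=["CUDA/10.1"] would keep a licensed or group-restricted module out of the nightly run without affecting other versions.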

ModuleLoadTest is meant for site administrators to test the software stack; it is not meant for users to run. I had a use case when I was managing a software stack with easybuild + spack + manual modules: users were reporting broken modules, so I decided to automate this module load test.

Hope that makes sense.

@shahzebsiddiqui shahzebsiddiqui self-assigned this Feb 28, 2020
@shahzebsiddiqui
Member Author

@vsoch I will take this task; I have a good idea of how this can work. I will do it after #188 is done.

@shahzebsiddiqui
Member Author

Closing this ticket; I have captured this in buildtesters/lmodule#1.
