Add sub-command category #600

Closed
wants to merge 7 commits into from

d-kirsten
Contributor

The commit adds the sub-command category | cat, which works similarly to overview but instead lists modules under the categories they belong to, and considers every module reported by Cache.

As Robert mentioned on the mailing list I used the Category entry that is being provided by whatis("Category: Tools, Develop").

Still missing are translations for the help text, which should be used at line 168 of src/lmod.in.lua.

Please let me know if I should make changes to the code.

@ocaisa

ocaisa commented Sep 8, 2022

Does/could this support multiple categories, e.g., module cat Tools AI?

@d-kirsten
Contributor Author

It currently just collects every category it comes across and displays them all. Additionally it is case-sensitive and splits on ",". Therefore "Develop" and "develop" would show as two categories.
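
As a rough illustration of the splitting described above, here is a sketch in plain Lua. It is not the code from this PR: parse_categories is a hypothetical helper name, and trimming whitespace around each entry is an assumption.

```lua
-- Hypothetical helper: extract the category names from a whatis string
-- such as "Category: Tools, Develop". Returns an empty table when the
-- string does not start with "Category:".
local function parse_categories(whatis_line)
   local cats = {}
   local list = whatis_line:match("^%s*Category:%s*(.*)$")
   if (not list) then
      return cats
   end
   -- Split on "," and trim surrounding whitespace from each piece.
   for piece in list:gmatch("[^,]+") do
      local name = piece:match("^%s*(.-)%s*$")
      if (name ~= "") then
         cats[#cats + 1] = name
      end
   end
   return cats
end

local cats = parse_categories("Category: Develop,IO, App")
print(table.concat(cats, " | "))
-- prints: Develop | IO | App
```

Because the names are compared as-is, "Develop" and "develop" would still end up as two separate categories with this approach.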

I assume it should be possible to only display those explicitly requested. Should this be case-insensitive and allow module category dev to display categories such as "Develop" and/or "Java Dev"?

Unfortunately I will be unavailable until 2022-09-20.

@ocaisa

ocaisa commented Sep 12, 2022

I'm just imagining how a user might use such a sub-command to navigate a particularly dense software tree;

  • It's hard to imagine that a user would be aware of (local) nomenclature for categories, so they'd need a way to query this. Also, how would a site support describing their nomenclature?
  • Then they could ask for the software in particular categories (but what to do in the case of a hierarchy where the software is not immediately loadable?)
  • It is probably also interesting to allow a reverse search to query other software in the domain of a particular package. A user could ask "what categories does GROMACS belong to?", and then see what other software is available in those categories. One could maybe even allow this in one step: module cat GROMACS would use the GROMACS categories to limit the displayed software.
  • What should be done with extensions? Packages like R and Python have thousands of these, with a good chance they touch almost every category.
  • Do you stop at one level of categories or allow subcategories? How would you decide on (and store) the ontology?
  • I wonder if this is something that should be described in the modules themselves, or supported via a configuration file to allow for more dynamic changes (and easy collaboration between sites). Such a file could allow for any number of levels of categories and include descriptions of categories

I don't expect you to actually implement this, I'm just imagining how we could usefully leverage this in EasyBuild/EESSI. In our scenarios, there are very large numbers of software packages and only a small subset are actually of real interest to any particular user.

EDIT: I always get confused by what the appropriate term is when talking about these things: nomenclature, ontology, taxonomy? After a bit of digging (which is not my first time), I think what I mean here is most accurately represented by a taxonomy

@d-kirsten
Contributor Author

I have now made the following changes:

  • module category will only display the categories, no modules.
└─╼ ml cat

To get a list of every module in a category execute:
   $ module category Foo

------------------------------------ List of Categories -------------------------------------
CFD
Develop
IO
Material

  • module category s1 s2 ... will display every category that matches at least one of the strings, along with all modules in those categories.
└─╼ ml cat E iO

To learn more about a package and how to load it execute:
   $ module spider Bar

------------------------------------------ Develop ------------------------------------------
Java (1)

-------------------------------------------- IO ---------------------------------------------
CGNS (2)

----------------------------------------- Material ------------------------------------------
GROMACS (1)
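
The matching shown in the output above could be sketched roughly as follows. This is an illustration only: match_categories is a hypothetical name, and treating the search strings as plain, case-insensitive substrings (rather than Lua patterns) is an assumption.

```lua
-- Hypothetical sketch: return the categories whose name contains at least
-- one of the search terms, ignoring case. Terms are treated as plain
-- substrings here (an assumption), not as Lua patterns.
local function match_categories(categories, terms)
   local matched = {}
   for _, cat in ipairs(categories) do
      local lower = cat:lower()
      for _, term in ipairs(terms) do
         if (lower:find(term:lower(), 1, true)) then
            matched[#matched + 1] = cat
            break  -- one matching term is enough
         end
      end
   end
   return matched
end

local all = { "CFD", "Develop", "IO", "Material" }
print(table.concat(match_categories(all, { "E", "iO" }), ", "))
-- prints: Develop, IO, Material
```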

@ocaisa

It's hard to imagine that a user would be aware of (local) nomenclature for categories, so they'd need a way to query this. Also, how would a site support describing their nomenclature?

For the first part: Would module category as shown above work for this?

For the second part: Could you give me an example of what you mean with "describing their nomenclature"? Like a brief explanation of what the category contains?

Could this be done within a category hook or msgHook? I describe this a bit more at the end.

Then they could ask for the software in particular categories (but what to do in the case of a hierarchy where the software is not immediately loadable?)

I added a disclaimer mentioning to use module spider Bar. Is this sufficient?

It is probably also interesting to allow a reverse search to query other software in the domain of a particular package. A user could ask "what categories does GROMACS belong to?", and then see what other software is available in those categories. One could maybe even allow this in one step: module cat GROMACS would use the GROMACS categories to limit the displayed software.

I imagine this may lead to confusion when there are categories matching that string e.g. Python category and module. Maybe this could be done with an option to module? Something like module --reverse category GROMACS.

But wouldn't that require the exact module name? In that case the user can get the categories with module whatis GROMACS. Though the reverse query would be more explicit in its intent, I think.

What should be done with extensions? Packages like R and Python have thousands of these, with a good chance they touch almost every category.

Do you stop at one level of categories or allow subcategories? How would you decide on (and store) the ontology?

I wonder if this is something that should be described in the modules themselves, or supported via a configuration file to allow for more dynamic changes (and easy collaboration between sites). Such a file could allow for any number of levels of categories and include descriptions of categories

Maybe a category hook would be helpful?

One suggestion:

  • If there are no search strings then it receives a table of categories
{ "Develop", "CFD", }

The table it returns will then be displayed one entry per line. This way you could provide descriptions. If the entries themselves are tables then we could also use ColumnTable.

  • If there are search strings then the hook could receive every category and its modules with the number of occurrences (this is how it currently collects everything):
{
   Develop = { git = 4, Fortran = 2, Python = 1, },
   CFD = { ANSYS = 3, OpenFOAM = 5, },
}

and it should return the categories and modules that are supposed to be displayed via ColumnTable and Banner. Doesn't have to be any existing category:

{
   Develop = { Python = 1, },
   ["Python: Develop"] = { ["python-pip"] = 1, },
}

If nothing is received then it will do its own matching based on the search strings.
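
For reference, the nested table described above could be built along these lines. This is only a sketch: the modulefiles input and the collect function are hypothetical, and the real code gathers this information from Lmod's cache rather than from a hand-written list.

```lua
-- Hypothetical input: one entry per modulefile, with the package name and
-- the categories taken from its whatis("Category: ...") line.
local modulefiles = {
   { name = "git",      categories = { "Develop" } },
   { name = "git",      categories = { "Develop" } },
   { name = "OpenFOAM", categories = { "CFD" } },
   { name = "Python",   categories = { "Develop", "Math" } },
}

-- Build { category = { package = count, ... }, ... }.
local function collect(entries)
   local t = {}
   for _, entry in ipairs(entries) do
      for _, cat in ipairs(entry.categories) do
         t[cat] = t[cat] or {}
         t[cat][entry.name] = (t[cat][entry.name] or 0) + 1
      end
   end
   return t
end

local t = collect(modulefiles)
print(string.format("%d %d %d %d",
                    t.Develop.git, t.Develop.Python, t.CFD.OpenFOAM, t.Math.Python))
-- prints: 2 1 1 1
```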

@ocaisa

ocaisa commented Sep 22, 2022

It's hard to imagine that a user would be aware of (local) nomenclature for categories, so they'd need a way to query this. Also, how would a site support describing their nomenclature?

For the first part: Would module category as shown above work for this?

Yes, I think this is a good approach.

For the second part: Could you give me an example of what you mean with "describing their nomenclature"? Like a brief explanation of what the category contains?

Could this be done within a category hook or msgHook? I describe this a bit more at the end.

Exactly, a brief explanation of what the category means. I like the idea of being able to use a hook to inject this, that seems more sustainable.

Then they could ask for the software in particular categories (but what to do in the case of a hierarchy where the software is not immediately loadable?)

I added a disclaimer mentioning to use module spider Bar. Is this sufficient?

Yes, perfect.

It is probably also interesting to allow a reverse search to query other software in the domain of a particular package. A user could ask "what categories does GROMACS belong to?", and then see what other software is available in those categories. One could maybe even allow this in one step: module cat GROMACS would use the GROMACS categories to limit the displayed software.

I imagine this may lead to confusion when there are categories matching that string e.g. Python category and module. Maybe this could be done with an option to module? Something like module --reverse category GROMACS.

But wouldn't that require the exact module name? In that case the user can get the categories with module whatis GROMACS. Though the reverse query would be more explicit in its intent, I think.

I can see the issues, best to just forget this in the first implementation.

What should be done with extensions? Packages like R and Python have thousands of these, with a good chance they touch almost every category.

Do you stop at one level of categories or allow subcategories? How would you decide on (and store) the ontology?

I wonder if this is something that should be described in the modules themselves, or supported via a configuration file to allow for more dynamic changes (and easy collaboration between sites). Such a file could allow for any number of levels of categories and include descriptions of categories

Maybe a category hook would be helpful?

Yes, I think a hook would be really useful, it keeps the main implementation simple but can still allow for additional complexity. You can then use the hook as you described for descriptions, and it could also be able to map subcategories. For example, if develop has subcategories foo and bar, it could automatically do the mapping

module category develop ---> module category develop foo bar

That way you can build up any number of levels.
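
A minimal sketch of that mapping, in plain Lua and independent of Lmod: the subcategories table and the expand function are hypothetical names, and a site would define its own taxonomy.

```lua
-- Hypothetical taxonomy: each key lists its direct subcategories.
local subcategories = {
   develop = { "foo", "bar" },
   foo     = { "baz" },
}

-- Expand a list of search terms so that every requested category is
-- followed by its subcategories, recursively. `seen` prevents duplicates
-- and guards against cycles in the taxonomy.
local function expand(terms)
   local out, seen = {}, {}
   local function visit(name)
      if (seen[name]) then
         return
      end
      seen[name] = true
      out[#out + 1] = name
      for _, child in ipairs(subcategories[name] or {}) do
         visit(child)
      end
   end
   for _, term in ipairs(terms) do
      visit(term)
   end
   return out
end

print(table.concat(expand({ "develop" }), " "))
-- prints: develop foo baz bar
```

The extra baz level is there only to show that the expansion is not limited to one level of nesting.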

@d-kirsten
Contributor Author

Maybe a category hook would be helpful?

Yes, I think a hook would be really useful, it keeps the main implementation simple but can still allow for additional complexity. You can then use the hook as you described for descriptions, and it could also be able to map subcategories. For example, if develop has subcategories foo and bar, it could automatically do the mapping

module category develop ---> module category develop foo bar

That way you can build up any number of levels.

With my suggestion for a hook and in case of search strings, the hook could return:

{
   Develop = { git = 4, Fortran = 2, Python = 1, },
   Foo = { ANSYS = 3, OpenFOAM = 5, },
   Bar = { git = 4, },
}

How could a description be added in here? Allow the return of a second optional table? And then add that text below the banner?

{
   Develop = { git = 4, Fortran = 2, Python = 1, },
   Foo = { ANSYS = 3, OpenFOAM = 5, },
   Bar = { git = 4, },
}

{
   Develop = "This is a description",
   Bar = "This has been added as a subcategory to Develop",
}

@ocaisa

ocaisa commented Sep 22, 2022

Sorry, my comment was confusing. I meant one part of the hook that acts on the table of categories { Develop, CFD, } to display the description (and perhaps order the subcategories), and a second part that acts on the second more complex format (i.e., the hook does different things in different modes).

Seriously though, you've done more than enough to satisfy me, perfection is the enemy of progress.

@d-kirsten
Contributor Author

I have added the hook now. Similar to the msgHook it receives the value "simple" or "complex" and a table.

In the simple case it should return an array:

{ "Develop", "CFD", }

and in the complex case the structure is:

{
   Develop = { git = 4, Fortran = 2, Python = 1, },
   CFD = { ANSYS = 3, OpenFOAM = 5, },
}

A mini example for a hook could be:

function cat_hook(kind, t)
   if (kind == "simple") then
      return { "My Category" }
   end

   if (kind == "complex") then
      local pargs = masterTbl().pargs
      local a = {}

      for _, v in ipairs(pargs) do
         if (v == "Develop") then
            a[v] = t[v]
            a.IO = t.IO
         end
      end

      return a
   end
end
hook.register("category", cat_hook)

@rtmclay
Member

rtmclay commented Oct 11, 2022

I have started to look at this. It would be very helpful if you could provide example modulefiles and whatever else is needed. If you could adapt the bugReport script into an example that I can run, that would greatly speed up integrating this into the production Lmod.

@d-kirsten
Contributor Author

The following patch is a minimal example. If you prefer I could add a few more modules so that more rows are visible when using the search for a category. gcc should have a (1) and (2) to show that each modulefile needs the category mentioned in whatis()

diff --git a/bugReport/bug_report_template.sh b/bugReport/bug_report_template.sh
index f9fd9c6e..6b2ad74e 100755
--- a/bugReport/bug_report_template.sh
+++ b/bugReport/bug_report_template.sh
@@ -8,8 +8,9 @@ export MODULEPATH=$PWD/my_modules/Core

 # Put whatever module commands you need to show your issue
 # Modify the modules in my_modules/Core if necessary if you are using the software hierarchy
-module load gcc mpich
-module av
+module category
+echo "######## Separation"
+module category dev IO



diff --git a/bugReport/my_modules/Core/Python/3.9.6.lua b/bugReport/my_modules/Core/Python/3.9.6.lua
new file mode 100644
index 00000000..f0e58bc9
--- /dev/null
+++ b/bugReport/my_modules/Core/Python/3.9.6.lua
@@ -0,0 +1 @@
+whatis("Category: Develop, Math")
diff --git a/bugReport/my_modules/Core/gcc/10.0.lua b/bugReport/my_modules/Core/gcc/10.0.lua
index 479ece7f..647d5f1d 100644
--- a/bugReport/my_modules/Core/gcc/10.0.lua
+++ b/bugReport/my_modules/Core/gcc/10.0.lua
@@ -3,3 +3,4 @@ prepend_path("MODULEPATH",pathJoin(MODULEPATH_ROOT,"Compiler/gcc/10"))



+whatis("Category: Develop,IO, App")
diff --git a/bugReport/my_modules/Core/gcc/9.0.lua b/bugReport/my_modules/Core/gcc/9.0.lua
new file mode 100644
index 00000000..e9cb00bc
--- /dev/null
+++ b/bugReport/my_modules/Core/gcc/9.0.lua
@@ -0,0 +1 @@
+whatis("Category: App,IO")

rtmclay pushed a commit that referenced this pull request Oct 12, 2022
@rtmclay
Member

rtmclay commented Oct 12, 2022

Thank you @d-kirsten and @ocaisa for creating this PR and providing feedback. This will be a very useful feature.

I have merged this PR into a new branch called "600_category". @d-kirsten @ocaisa Please test out this new feature on this branch. Also, if @boegel and @wpoely86 get the chance, please test this feature out and provide feedback. Thanks!

a[#a+1] = [[
To learn more about a package and how to load it execute:
$ module spider Bar
]]

@ocaisa ocaisa Oct 14, 2022

Suggested change
-]]
+where "Bar" is the name of a module.
+]]

This is more like how it is done for module keyword

Maybe package instead of module here since that is the word you used previously.

Contributor Author

I added the line with the word package as a new commit

@ocaisa

ocaisa commented Oct 14, 2022

@rtmclay I tested this on a flat and hierarchical naming scheme and it worked as expected. I also played around a bit with hooks to affect how the information is displayed and this also seems to work (though I will need to learn some more Lua!).

@d-kirsten The one thing that was not clear to me was whether I can use a hook to automatically search subcategories. For example if I know Bar is a subcategory of Foo, can I have a hook that when I execute

module cat Foo

the table that gets created is actually

module cat Foo Bar

Or perhaps the modules that fall under Bar could also be included under Foo (if Bar is not listed on the command line)? I believe the complex table will allow me to do this, since I should be able to just add the contents of one entry to the other, but my Lua is not good enough to figure out how to do this (and I also wonder whether there would be double-counting if a module has both the main category and the subcategory).

@rtmclay
Member

rtmclay commented Oct 14, 2022

AFAIK, Lmod has no notion of a sub-category. So what do you mean when you say that something is a sub-category?

@ocaisa

ocaisa commented Oct 14, 2022

I thought I might be able to define subcategories myself via a hook. As far as Lmod is concerned everything is a category, but when I present things I can organise things according to my scheme.

Lmod would see categories Foo and Bar, but in my scheme Bar is a subcategory of Foo. Then I write a hook to get output like

[ocaisa@gpunode1 lmod_hook]$ module cat

To get a list of every module in a category execute:
   $ module category Foo
      
-------------------------------------------------------------------------------- List of Categories --------------------------------------------------------------------------------
Foo: This is a category
  Bar: This is a subcategory

and then if Python is in category Foo and SciPy is in category Bar, if I do

module cat Foo

I actually get

------------------------------------------------------------------------------------- Foo --------------------------------------------------------------------------------------
Python (1)   SciPy (1)

which is (somewhat) similar to the output of module cat Foo Bar

@ocaisa

ocaisa commented Oct 14, 2022

Ah, I got it to work with

   if (kind == "complex") then
      local pargs = masterTbl().pargs
      local a = {}

      for _, v in ipairs(pargs) do
         a[v] = t[v]
         if (v == "Develop") then
            for key, value in pairs(t["Tools"]) do
               a[v][key] = value
            end
         end
      end

      return a
   end

which gives

[ocaisa@gpunode1 lmod_hook]$ module cat develop

To learn more about a package and how to load it execute:
   $ module spider Bar
      
------------------------------------------------------------------------------------- Develop --------------------------------------------------------------------------------------
scikit-build (1)


[ocaisa@gpunode1 lmod_hook]$ module cat Develop

To learn more about a package and how to load it execute:
   $ module spider Bar
      
------------------------------------------------------------------------------------- Develop --------------------------------------------------------------------------------------
FFTW (1)   scikit-build (1)

(as the hook is currently case sensitive).
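
One Lua detail worth noting about the hook above: a[v] = t[v] stores a reference to the same table, so the inner loop also writes the Tools modules into t.Develop itself, and pairs(t["Tools"]) raises an error if no Tools category exists. A copy-based variant might look like the sketch below. The helper names copy and merged are hypothetical, and keeping the subcategory's count for a package present in both categories (rather than summing) is an assumption chosen to avoid double-counting.

```lua
-- Hypothetical helper: shallow-copy a { package = count } table.
local function copy(tbl)
   local c = {}
   for k, v in pairs(tbl or {}) do
      c[k] = v
   end
   return c
end

-- Return the modules of `parent` plus those of `child`, leaving `t`
-- untouched. A package present in both keeps the child's count (as in
-- the loop above) rather than the sum, so nothing is counted twice.
local function merged(t, parent, child)
   local result = copy(t[parent])
   for key, value in pairs(t[child] or {}) do
      result[key] = value
   end
   return result
end

local t = {
   Develop = { ["scikit-build"] = 1 },
   Tools   = { FFTW = 1, ["scikit-build"] = 1 },
}
local dev = merged(t, "Develop", "Tools")
print(tostring(dev.FFTW) .. " " .. tostring(t.Develop.FFTW))
-- prints: 1 nil
```

Inside the hook this would be used as a[v] = merged(t, v, "Tools") in place of the assignment and inner loop.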

rtmclay and others added 2 commits October 14, 2022 13:17
Co-authored-by: ocaisa <alan.ocais@cecam.org>
@d-kirsten
Contributor Author

I have merged this PR into a new branch called "600_category". @d-kirsten @ocaisa Please test out this new feature on this branch.

It works for me. @ocaisa does the change for the simple case still allow you to use descriptions the way you wanted to? With the recent change it formats it into columns: module cat

Ah, I got it to work with

   if (kind == "complex") then
      local pargs = masterTbl().pargs
      local a = {}

      for _, v in ipairs(pargs) do
         a[v] = t[v]
         if (v == "Develop") then
            for key, value in pairs(t["Tools"]) do
               a[v][key] = value
            end
         end
      end

      return a
   end

Instead of

for key, value in pairs(t["Tools"]) do
   a[v][key] = value
end

you could also do

a.Tools = t.Tools

which, instead of adding the modules to Develop, would display the whole Tools category separately. Note, though, that your example requires the search term to be an exact category name.

Maybe something like this (hopefully it works) is what you might want:

   if (kind == "complex") then
      local pargs = masterTbl().pargs
      local a = {}

      local extra = {
         Develop = { "CFD", "IO" },
         CFD     = { "Material", },
      }

      for _, arg in ipairs(pargs) do
         local term = arg:caseIndependent()

         -- Add every matching category
         for cat, v in pairs(t) do
            if (cat:find(term)) then
               a[cat] = v
            end
         end

         -- Add additional categories based on search.
         for match, add in pairs(extra) do
            if (match:find(term)) then
               for _, new in pairs(add) do
                  a[new] = t[new]
               end
            end
         end
      end

      return a
   end

@ocaisa

ocaisa commented Oct 17, 2022

@d-kirsten Yes, descriptions were OK. I quickly put together a naive implementation:

   if (kind == "simple") then
      local a = {}

      for i, v in ipairs(t) do
         if (v == "Develop") then
            a[i] = v .. ": This is develop\n"
         else
            a[i] = v .. ": Blank\n"
         end
      end

      return a
   end

which looks like

[ocaisa@gpunode1 hier]$ module cat

To get a list of every module in a category execute:
   $ module category Foo
      
-------------------------------------------------------------------------------- List of Categories --------------------------------------------------------------------------------
Develop: This is develop
  Tools: Blank

Lmod seems to be pretty smart in how it presents that, with a bit more experience I think I can display it as I like.

I tweaked the above hook for the simple case.

-- Return true if `data` is one of the entries of `array`.
function valid(data, array)
   for i = 1, #array do
      if (array[i] == data) then
         return true
      end
   end
   return false
end

function cat_hook(kind, t)
   if (kind == "simple") then
      local a = {}

      for _, v in ipairs(t) do
         if (v == "Develop") then
            local line = v .. ": This is develop\n"
            -- List Tools directly below Develop as its subcategory
            if valid("Tools", t) then
               line = line .. "\\__" .. "Tools" .. ": This is tools\n"
            end
            a[#a + 1] = line
         elseif not valid(v, {"Tools"}) then
            -- Append instead of indexing by position, so that skipping
            -- Tools does not leave a hole in the returned array
            a[#a + 1] = v .. ": (No description available)\n"
         end
      end

      return a
   end
end

which displays ordered subcategories

[ocaisa@gpunode1 lmod_hook]$ module cat

To get a list of every module in a category execute:
   $ module category Foo
      
----------------------------------------------------------------------- List of Categories -----------------------------------------------------------------------
Develop: This is develop
\__Tools: This is tools

The hook is pretty verbose, but with a little thought I think you could process a json file with the taxonomy and simplify things.
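
As a sketch of that idea, the taxonomy could be kept as data and the "simple" output generated from it. A Lua table stands in for the JSON file here so the example needs no JSON library. The taxonomy layout and the names describe and to_set are hypothetical, and categories that are not in the taxonomy are simply dropped in this version.

```lua
-- Hypothetical taxonomy, e.g. loaded from a site-maintained file.
local taxonomy = {
   { name = "Develop", description = "This is develop",
     children = {
        { name = "Tools", description = "This is tools" },
     },
   },
}

-- Turn the taxonomy into the array of lines a "simple" hook returns.
-- Only categories present in `available` are listed; nested entries are
-- indented by two spaces per level.
local function describe(nodes, available, depth, out)
   depth, out = depth or 0, out or {}
   for _, node in ipairs(nodes) do
      if (available[node.name]) then
         out[#out + 1] = string.rep("  ", depth) .. node.name .. ": "
                         .. node.description
      end
      describe(node.children or {}, available, depth + 1, out)
   end
   return out
end

-- Build a lookup set from the array of category names passed to the hook.
local function to_set(array)
   local set = {}
   for _, v in ipairs(array) do
      set[v] = true
   end
   return set
end

local lines = describe(taxonomy, to_set({ "Develop", "Tools" }))
print(table.concat(lines, "\n"))
-- prints:
-- Develop: This is develop
--   Tools: This is tools
```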

@rtmclay For me, I think everything I would like to see is there and working.

@rtmclay
Member

rtmclay commented Oct 19, 2022

Note that I have modified "module cat" to use "ColumnTable" instead of a straight list. Please test the latest version of the 600_category branch to see if it works for you.

@ocaisa

ocaisa commented Oct 20, 2022

@rtmclay Everything seems ok to me, my hooks worked unchanged:

[ocaisa@gpunode1 hier]$ module cat

To get a list of every module in a category execute:
   $ module category Foo
      
-------------------------------------------------------------------------------- List of Categories --------------------------------------------------------------------------------
Develop  Tools


[ocaisa@gpunode1 hier]$ export LMOD_PACKAGE_PATH=$PWD/lmod_hook
[ocaisa@gpunode1 hier]$ module cat

To get a list of every module in a category execute:
   $ module category Foo
      
-------------------------------------------------------------------------------- List of Categories --------------------------------------------------------------------------------
Develop: This is develop
\__Tools: This is tools


[ocaisa@gpunode1 hier]$ module cat develop

To learn more about a package and how to load it execute:
   $ module spider Bar
      
------------------------------------------------------------------------------------- Develop --------------------------------------------------------------------------------------
scikit-build (1)

-------------------------------------------------------------------------------------- Tools ---------------------------------------------------------------------------------------
FFTW (1)   scikit-build (1)


[ocaisa@gpunode1 hier]$ module --version

Modules based on Lua: Version 8.7.13 (8.7.13-33-g570e3719) 2022-09-14 12:56 -05:00
    by Robert McLay mclay@tacc.utexas.edu

@rtmclay
Member

rtmclay commented Oct 20, 2022

Great! I'll merge this branch into the main branch and release a new version next week.

@ocaisa

ocaisa commented Oct 31, 2022

@rtmclay I was just wondering when that release will land? I'd like to hold back a release of https://github.com/EESSI/gentoo-overlay until we can include it

@rtmclay
Member

rtmclay commented Oct 31, 2022

Soon. I am trying to release this fix and Issue #604 and I have run into problems. I'll have it fixed by Monday 11/7 or sooner.

@rtmclay
Member

rtmclay commented Nov 1, 2022

This branch has been merged into Lmod 8.7.14. In a week this will become 8.8. Please test Lmod 8.7.14 if you get the chance.

@rtmclay rtmclay closed this Nov 1, 2022
3 participants