# ESjob class - electronic structure file handling & ePS job creation
12/02/22

Demo for basic electronic structure file IO and creation of ePS jobs based on inputs.

- Currently tested for Gamess and Molden IO only.
- Uses [CCLIB on the backend](http://cclib.github.io/), so should be easily extendable to other CCLIB-supported cases.


## Imports

In [1]:
# Import main package
# import epsman as em

# For ePS job creation with electronic structure handling, use elecStructure.ESjob
from epsman.elecStructure.ESjob import ESjob

In [2]:
# For testing, set the module path and test file.
import inspect
from pathlib import Path

modDir = Path(inspect.getfile(ESjob)).parent
testFilePath = modDir/'fileTest'  # Default module test files, these are included with Github repo.

## Class creation

Basic file handling for Gamess log files is implemented.

- Minimal setup is `ESjob(fileName = <electronic structure file>)`.
- If job settings are already defined they are used; otherwise pass the full path here with `fileBase = <full path>`.
- This also creates an empty base job object, which can be ignored if only electronic structure file handling is required.

Further details:

- The file is handled via an `EShandler` class object.
- All methods are available as `self.esData.method()`.
- File parsing and conversion is via [CCLIB](https://cclib.github.io), a full list of compatible electronic structure packages & data IO can [be found in the CCLIB docs](https://cclib.github.io/data.html).
- Only [Gamess (US)](https://www.msg.chem.iastate.edu/gamess/) files have been tested so far, which are further converted to [Molden format](https://www.theochem.ru.nl/molden/) for ePS (although recent versions of ePS can also read Gamess files directly).
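As a rough illustration of the parsing step described above, the sketch below reads a file with `cclib.io.ccread` and reports the atom and orbital counts, similar to the "Read 1 atoms and 68 MOs" message in the class output. The helper names `read_es_file` and `summarise` are hypothetical and are not part of epsman; the stand-in object only mimics the `natom` and `moenergies` attributes of a CCLIB data object so that the example runs without an input file.

```python
from types import SimpleNamespace


def read_es_file(path):
    """Parse an electronic structure file with cclib (hypothetical helper)."""
    # Imported lazily so the rest of the sketch runs without cclib installed.
    from cclib.io import ccread
    return ccread(str(path))


def summarise(data):
    """Return (number of atoms, number of MOs) for a cclib-style data object."""
    n_mos = len(data.moenergies[0])  # first spin block of MO energies
    return data.natom, n_mos


# Stand-in with the same attribute names as a cclib data object (assumption:
# only natom and moenergies are needed for this summary).
demo = SimpleNamespace(natom=1, moenergies=[[-34724.875681, -5506.008795]])
print("Read {} atoms and {} MOs".format(*summarise(demo)))
```

With a real file, `summarise(read_es_file(testFilePath/'xe_SPKrATZP_rel.log'))` would be the equivalent call, assuming cclib is installed.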

In [3]:
# Basic file handling for Gamess Log files is implemented.
job = ESjob(fileName = 'xe_SPKrATZP_rel.log', fileBase = testFilePath)

# Note an empty object can also be created.
# job = ESjob()

Set host = None
Set user = None
Set IP = None
Set password = None
Set mol = None
Set orb = None
Set batch = None
Set jobNote = None
Set elecStructure = None
Set genFile = None
Set jobSettings = None
Skipping setJobPaths() until job settings defined, run setJob() to set.
Set elecStructure = xe_SPKrATZP_rel.log

Set input file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.log, use self.setFiles to change.
Set output file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.molden, use self.setMoldenFile to override.

*** Read file /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.log with CCLIB, data set to self.data.
Read 1 atoms and 68 MOs
*** Set orbPD data to self.orbPD, set group data to self.orbGrps

Found input file Point Group: {'Name': 'DNH', 'NAXIS': '8', 'ORDER': '32'}.
Mapped PGs: Gamess (DNH, 8)  > ePS (DAh) dim mapping.
Found 68 orbitals, in 45 groups.
Found 7 orb symmetries: ['A1g' 'A2u' 'E1u' 'E1g' 'E2g' '



E,iOrbGrp,OrbN,syms,Occ,OccN,degen,OrbGrpOcc,Gamess,ePS
-34724.875681,1,1,A1g,True,2,1,2,A1G,SG
-5506.008795,2,2,A1g,True,2,1,2,A1G,SG
-4937.337107,3,3,A2u,True,2,1,2,A2U,A2U
-4937.337107,4,4,E1u,True,2,2,4,E1U,PU
-4937.337107,4,5,E1u,True,2,2,4,E1U,PU
-1169.749415,5,6,A1g,True,2,1,2,A1G,SG
-981.237103,6,7,A2u,True,2,1,2,A2U,A2U
-981.237103,7,8,E1u,True,2,2,4,E1U,PU
-981.237103,7,9,E1u,True,2,2,4,E1U,PU
-700.20336,9,10,E1g,True,2,2,4,E1G,PG


Note the outputs here: 

- `OrbN` is the orbital numbering from the electronic structure file, usually corresponding to doubly-occupied orbitals as given by `OccN`.
- `iOrbGrp` renumbers the orbitals by group, with degenerate orbitals sharing a group number, which matches the internal ordering in ePolyScat. Group occupation is given by `OrbGrpOcc`.
- Energies (`E`) are in eV.
- `degen` is the orbital degeneracy.
- `syms` lists the symmetries returned by CCLIB.
- If reading a Gamess file, `Gamess` lists the symmetries in the usual format.
- If a point group was found, `ePS` lists the equivalent ePS symmetry label for the group; the full list can be found at https://epolyscat.droppages.com/SymmetryLabels. Note that this mapping is set by `epsman.sym.convertSyms.convertSymsGamessePS()`, and can be set manually if incorrect. (TO TEST)
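The relation between `OrbN`, `iOrbGrp`, `degen` and `OrbGrpOcc` can be sketched as below. This is an illustration only, not the epsman implementation: it assumes that consecutive orbitals with identical energy and symmetry label form one group, and uses the first nine rows of the table above as input. The function name `group_orbitals` is hypothetical.

```python
from itertools import groupby

# (energy / eV, symmetry, occupation) for the first nine orbitals listed above.
orbitals = [
    (-34724.875681, "A1g", 2),
    (-5506.008795, "A1g", 2),
    (-4937.337107, "A2u", 2),
    (-4937.337107, "E1u", 2),
    (-4937.337107, "E1u", 2),
    (-1169.749415, "A1g", 2),
    (-981.237103, "A2u", 2),
    (-981.237103, "E1u", 2),
    (-981.237103, "E1u", 2),
]


def group_orbitals(orbs):
    """Group consecutive orbitals sharing (energy, symmetry) into orbital groups."""
    groups = []
    numbered = enumerate(orbs, start=1)  # OrbN counts from 1
    # Assumption: degenerate orbitals have exactly equal energies in the file.
    for (energy, sym), members in groupby(numbered, key=lambda item: item[1][:2]):
        members = list(members)
        groups.append({
            "iOrbGrp": len(groups) + 1,
            "E": energy,
            "OrbN": [n for n, _ in members],
            "syms": sym,
            "degen": len(members),
            "OrbGrpOcc": sum(orb[2] for _, orb in members),
        })
    return groups


for grp in group_orbitals(orbitals):
    print(grp["iOrbGrp"], grp["syms"], grp["OrbN"], grp["degen"], grp["OrbGrpOcc"])
```

Here orbitals 4 and 5 (E1u) end up in a single group with `degen = 2` and `OrbGrpOcc = 4`, as in the table.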

Attributes are also stored in the class:

In [4]:
job.elecStructure

'xe_SPKrATZP_rel.log'

In [5]:
# The ESclass.EShandler class is used on the backend, and can be accessed as .esData
job.esData

<epsman.elecStructure.ESclass.EShandler at 0x7f49d0f9f8d0>

In [6]:
# Point group set
job.esData.PG

'DAh'

In [7]:
# Point group info - this lists the members with ePolyScat labels
job.esData.PGinfo

{'ePSlabels': ['SG',
  'A2G',
  'B1G',
  'B2G',
  'PG',
  'DG',
  'FG',
  'GG',
  'SU',
  'A2U',
  'B1U',
  'B2U',
  'PU',
  'DU',
  'FU',
  'GU'],
 'ePSnote': ['D', 'infinity', 'h']}

In [8]:
# Full symmetry mapping details - this lists input symmetries and mapping to ePS labels (currently only Gamess > ePS mapping supported)
job.esData.orbPD.attrs['PGmap']

Index,Gamess,ePS,GDims
1,A1G,SG,13.0
2,A1U,SU,0.0
3,A2G,A2G,0.0
4,A2U,A2U,9.0
5,B1G,B1G,0.0
6,B1U,B1U,0.0
7,B2G,B2G,0.0
8,B2U,B2U,0.0
9,E1G,PG,5.0
10,E1U,PU,9.0


In [9]:
# CCLIB data object is also accessible as .data
job.esData.data

<cclib.parser.data.ccData_optdone_bool at 0x7f49d0eebb10>

### Update/change master file

Just set a new file name and/or path.

This may also be required if the `ESjob` class is initialised without a file set.

In [10]:
# If the file doesn't exist an error will be printed.
job.setMasterESfile(fileName = 'xe_SPKrATZP_rel_copy.log')

Set elecStructure = xe_SPKrATZP_rel_copy.log

Set input file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel_copy.log, use self.setFiles to change.
Set output file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel_copy.molden, use self.setMoldenFile to override.
Could not import `openbabel`, fallback mechanism might not work.

*** Read file /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel_copy.log with CCLIB, data set to self.data.
*** Error: File /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel_copy.log not found or empty.
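The "not found or empty" error above suggests a simple pre-check before setting a new master file. The sketch below is one way to do this with `pathlib`; `es_file_ok` is a hypothetical helper and the test files are temporary placeholders, not part of the epsman test set.

```python
import tempfile
from pathlib import Path


def es_file_ok(path):
    """True if the path is an existing, non-empty file."""
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "demo.log"
    good.write_text("placeholder contents\n")
    empty = Path(tmp) / "empty.log"
    empty.touch()
    missing = Path(tmp) / "missing.log"
    for candidate in (good, empty, missing):
        print(candidate.name, es_file_ok(candidate))
```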


## Molden file creation demo

- This is set by `self.checkLocalESfiles()` 
   - This will check for a Molden version of the file and sync to host (if set).
- A call to `self.esData.writeMoldenFile2006()` directly will write a new Molden file from the Gamess .log file.
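The check-then-create logic described above can be sketched as follows. This is a minimal illustration assuming the Molden file sits next to the log file with a `.molden` suffix, as in the path messages printed by the class; `ensure_molden` and the placeholder writer are hypothetical and do not call epsman or CCLIB.

```python
import tempfile
from pathlib import Path


def ensure_molden(log_path, writer):
    """Return the Molden path for log_path, calling writer only if it is missing."""
    log_path = Path(log_path)
    molden_path = log_path.with_suffix(".molden")  # same stem, new suffix
    if not molden_path.is_file():
        writer(log_path, molden_path)
    return molden_path


def placeholder_writer(log_path, molden_path):
    """Stand-in for a real converter; writes only a header line."""
    molden_path.write_text("[Molden Format]\n")


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "xe_SPKrATZP_rel.log"
    log.write_text("placeholder log contents\n")
    print(ensure_molden(log, placeholder_writer).name)  # file created
    print(ensure_molden(log, placeholder_writer).name)  # file reused
```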

In [11]:
# If a Gamess file is set, the molden version will be checked & can be created if missing
# For testing use <github repo>/tests (set in .gitignore)
testPath = modDir.parent.parent/'tests'
testPath.mkdir(exist_ok=True)

# Reset master file to use for testing
job.setMasterESfile(fileName = 'xe_SPKrATZP_rel.log')

Set elecStructure = xe_SPKrATZP_rel.log

Set input file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.log, use self.setFiles to change.
Set output file as /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.molden, use self.setMoldenFile to override.

*** Read file /home/eps/github/epsman/epsman/elecStructure/fileTest/xe_SPKrATZP_rel.log with CCLIB, data set to self.data.
Read 1 atoms and 68 MOs
*** Set orbPD data to self.orbPD, set group data to self.orbGrps

Found input file Point Group: {'Name': 'DNH', 'NAXIS': '8', 'ORDER': '32'}.
Mapped PGs: Gamess (DNH, 8)  > ePS (DAh) dim mapping.
Found 68 orbitals, in 45 groups.
Found 7 orb symmetries: ['A1g' 'A2u' 'E1u' 'E1g' 'E2g' 'E3u' 'E2u']
Assigned 54 electrons to 27 orbitals/19 orbital groups.

Occupied orbitals table:




E,iOrbGrp,OrbN,syms,Occ,OccN,degen,OrbGrpOcc,Gamess,ePS
-34724.875681,1,1,A1g,True,2,1,2,A1G,SG
-5506.008795,2,2,A1g,True,2,1,2,A1G,SG
-4937.337107,3,3,A2u,True,2,1,2,A2U,A2U
-4937.337107,4,4,E1u,True,2,2,4,E1U,PU
-4937.337107,4,5,E1u,True,2,2,4,E1U,PU
-1169.749415,5,6,A1g,True,2,1,2,A1G,SG
-981.237103,6,7,A2u,True,2,1,2,A2U,A2U
-981.237103,7,8,E1u,True,2,2,4,E1U,PU
-981.237103,7,9,E1u,True,2,2,4,E1U,PU
-700.20336,9,10,E1g,True,2,2,4,E1G,PG


In [12]:
# Make a copy & test
import shutil
shutil.copy(job.esData.fullPath.as_posix(), (testPath/job.elecStructure).as_posix())

# Set & read new file - note this does not create Molden file
job.setMasterESfile(fileName = job.elecStructure, fileBase = testPath)

Set elecStructure = xe_SPKrATZP_rel.log

Set input file as /home/eps/github/epsman/tests/xe_SPKrATZP_rel.log, use self.setFiles to change.
Set output file as /home/eps/github/epsman/tests/xe_SPKrATZP_rel.molden, use self.setMoldenFile to override.

*** Read file /home/eps/github/epsman/tests/xe_SPKrATZP_rel.log with CCLIB, data set to self.data.
Read 1 atoms and 68 MOs
*** Set orbPD data to self.orbPD, set group data to self.orbGrps

Found input file Point Group: {'Name': 'DNH', 'NAXIS': '8', 'ORDER': '32'}.
Mapped PGs: Gamess (DNH, 8)  > ePS (DAh) dim mapping.
Found 68 orbitals, in 45 groups.
Found 7 orb symmetries: ['A1g' 'A2u' 'E1u' 'E1g' 'E2g' 'E3u' 'E2u']
Assigned 54 electrons to 27 orbitals/19 orbital groups.

Occupied orbitals table:




E,iOrbGrp,OrbN,syms,Occ,OccN,degen,OrbGrpOcc,Gamess,ePS
-34724.875681,1,1,A1g,True,2,1,2,A1G,SG
-5506.008795,2,2,A1g,True,2,1,2,A1G,SG
-4937.337107,3,3,A2u,True,2,1,2,A2U,A2U
-4937.337107,4,4,E1u,True,2,2,4,E1U,PU
-4937.337107,4,5,E1u,True,2,2,4,E1U,PU
-1169.749415,5,6,A1g,True,2,1,2,A1G,SG
-981.237103,6,7,A2u,True,2,1,2,A2U,A2U
-981.237103,7,8,E1u,True,2,2,4,E1U,PU
-981.237103,7,9,E1u,True,2,2,4,E1U,PU
-700.20336,9,10,E1g,True,2,2,4,E1G,PG


In [13]:
# self.checkLocalESfiles() will check for Molden file and also sync to host (if set)
job.checkLocalESfiles()

Written Molden2006 format file /home/eps/github/epsman/tests/xe_SPKrATZP_rel.molden
Set elecStructure = xe_SPKrATZP_rel.molden

*** Can't sync files, self.host or self.hostDefn[host][elecFile] not set.


In [14]:
# self.esData.write* methods can also be called
job.esData.writeMoldenFile2006()

Written Molden2006 format file /home/eps/github/epsman/tests/xe_SPKrATZP_rel.molden


## Create ePolyScat input from Gamess source

The main routine `self.buildePSjob` will attempt to execute all the build steps and, hopefully, produce useful output if it fails.

Minimally this needs an ionization channel defined (this is set according to the `iOrbGrp` numbers defined in the tables above), which will allow local job creation. However, if a host machine is not set, the final steps will fail.

In [15]:
# Test with host not set...
job.buildePSjob(channel = 18)

*** Set ionization from orbital/channel 18.
Updated orb table...

Occupied orbitals by group:


iOrbGrp,E,OrbN,syms,Occ,OccN,degen,OrbGrpOcc,Gamess,ePS,OrbGrpOccFinal
1,-34724.875681,1,A1g,True,2,1,2,A1G,SG,2
2,-5506.008795,2,A1g,True,2,1,2,A1G,SG,2
3,-4937.337107,3,A2u,True,2,1,2,A2U,A2U,2
4,-4937.337107,4,E1u,True,2,2,4,E1U,PU,4
5,-1169.749415,6,A1g,True,2,1,2,A1G,SG,2
6,-981.237103,7,A2u,True,2,1,2,A2U,A2U,2
7,-981.237103,8,E1u,True,2,2,4,E1U,PU,4
9,-700.20336,10,E1g,True,2,2,4,E1G,PG,4
8,-700.20336,12,A1g,True,2,1,2,A1G,SG,2
10,-700.20336,13,E2g,True,2,2,4,E2G,DG,4



*** Building job test, orb18 (A2U/DAh), batch: test
Set mol = test
Set orb = orb18_A2U
Set batch = test
Generator file set: test.orb18_A2U.conf

*** Job paths set in self.hostDefn['None']:

Set self.ePSglobals for global job settings.
{'LMax': 30}
self.symList not set, running for defaults (all symmetry species).

*** Failed to build job, can't write gen file or create job tree - is the host set?
None


False

In this example the build fails at file IO, since there is no host or paths defined. The ePSrecords generated can be found in two dictionaries, defining the calculation parameters and molecule settings. (For more on these settings, see the [ePolyScat manual and sample jobs.](https://epolyscat.droppages.com/))

In [19]:
# Global calculation settings
job.esData.ePSglobals

{'LMax': 30}

In [20]:
# Job settings
job.esData.ePSrecords

OrderedDict([('elecStructure',
              PosixPath('/home/eps/github/epsman/tests/xe_SPKrATZP_rel.molden')),
             ('IP', 12.422),
             ('OrbOccInit', '2 2 2 4 2 2 4 4 2 4 2 2 4 2 4 4 2 2 4'),
             ('OrbOccTarget', '2 2 2 4 2 2 4 4 2 4 2 2 4 2 4 4 2 1 4'),
             ('InitSpinDeg', 1),
             ('TargSpinDeg', 2),
             ('SpinDeg', 1),
             ('InitSym', 'SG'),
             ('TargSym', 'A2U'),
             ('Ssym',
              '(SG SG SG SG SG SG SG SG SG SG SG SG SG SG SG SG A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G A2G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B1G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G B2G PG PG PG PG PG PG PG PG PG PG PG PG PG PG PG PG DG DG DG DG DG DG DG DG DG DG DG DG DG DG DG DG FG FG FG FG FG FG FG FG FG FG FG FG FG FG FG FG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG SU SU SU SU SU SU SU SU SU SU SU SU SU SU SU SU A2U A2U A2U A2U A2U A2U A2U A2U A2U A2
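The `OrbOccInit` and `OrbOccTarget` strings in the records above differ only at the ionization channel: one electron is removed from orbital group 18. A minimal sketch of that bookkeeping, assuming the channel is a 1-based index into the group occupations (consistent with the records shown), is given below; `ionize` is a hypothetical helper and not the epsman routine.

```python
def ionize(occ_init, channel):
    """Remove one electron from the 1-based orbital group `channel`."""
    occ = [int(n) for n in occ_init.split()]
    if occ[channel - 1] < 1:
        raise ValueError(f"Orbital group {channel} is already empty")
    occ[channel - 1] -= 1
    return " ".join(str(n) for n in occ)


occ_init = "2 2 2 4 2 2 4 4 2 4 2 2 4 2 4 4 2 2 4"  # OrbOccInit from the records
occ_target = ionize(occ_init, 18)

print(occ_target)                                # 2 2 2 4 2 2 4 4 2 4 2 2 4 2 4 4 2 1 4
print(sum(int(n) for n in occ_init.split()))     # 54 electrons before ionization
print(sum(int(n) for n in occ_target.split()))   # 53 electrons after ionization
```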

**Note that the default case sets all possible scattering and continuum symmetries for the point group set, which is not usually what one wants... setting self.symList directly will bypass the automatic setting.**

To override this manually, `self.esData.genSymList()` can be used to generate pairs, or simply pass a list to `job.esData.symList` directly.

In [26]:
# symList holds all (Scattering, Continuum) pairs as tuples.
# This can be set from lists with genSymList

# TODO: fix self.setSyms for this!
# TODO: add Ssym,Csym options to buildePSjob() method.

job.esData.genSymList(Ssym=['SG','A2G'],Csym=['SG','DG'])
job.esData.symList

[('SG', 'SG'), ('SG', 'DG'), ('A2G', 'SG'), ('A2G', 'DG')]
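The output above is consistent with taking every (scattering, continuum) combination of the two input lists. A stand-alone sketch of that pairing with `itertools.product` is shown below; `gen_sym_pairs` is a hypothetical name, and the behaviour of `genSymList()` is only inferred from the output shown, not from its source.

```python
from itertools import product


def gen_sym_pairs(Ssym, Csym):
    """All (scattering, continuum) symmetry pairs, scattering label varying slowest."""
    return list(product(Ssym, Csym))


pairs = gen_sym_pairs(Ssym=["SG", "A2G"], Csym=["SG", "DG"])
print(pairs)  # [('SG', 'SG'), ('SG', 'DG'), ('A2G', 'SG'), ('A2G', 'DG')]
```

A hand-picked subset of pairs can be written directly as a list of tuples in the same form.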

## Versions

In [16]:
import scooby
scooby.Report(additional=['epsman', 'fabric', 'cclib'])

Sat Feb 12 14:23:42 2022 EST
OS,Linux,CPU(s),64,Machine,x86_64
Architecture,64bit,Environment,Jupyter
Python 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
epsman,0.0.1,fabric,2.6.0,cclib,1.7
numpy,1.19.2,scipy,1.6.1,IPython,7.21.0
matplotlib,3.3.4,scooby,0.5.6
Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications


In [17]:
# Check current Git commit for local epsman version
from pathlib import Path
import epsman as em

!git -C {Path(em.__file__).parent} branch
!git -C {Path(em.__file__).parent} log --format="%H" -n 1

  master
* restructure160221
2cdcf0ebb27837e540886652f591c6a0246dc46b


In [18]:
# Check current remote commits
!git ls-remote --heads https://github.com/phockett/epsman

21b4357a169baf9fa7887c68bd1cf8f92c59642c	refs/heads/master
2cdcf0ebb27837e540886652f591c6a0246dc46b	refs/heads/restructure160221
