# Porting Barbieri / Van Hove `phshift2007` programs

This notebook contains working notes on the initial process of porting the Barbieri / Van Hove [phshift2007](https://www.icts.hkbu.edu.hk/VanHove_files/leed/phshift2007.zip) package into a Python-wrapped library for inclusion under
`phaseshifts.lib.libphsh`.

The notebook assumes the user is running on a unix platform with tools such as `wget`, `unzip` and (GNU) `grep` installed.

# Important Notes

Permission was sought by the original `phaseshifts` package author, Liam Deacon, whilst working at Diamond Light Source Ltd., and kindly granted by [Professor Michel Van Hove](mailto:vanhove@cityu.edu.hk), to use the `phshift2007` code hosted on his [LEED Calculation Home Page](https://www.icts.hkbu.edu.hk/VanHove_files/leed/leedpack.html) for inclusion in this Python package.

Out of respect for the original phshift2007 authors, please cite their work when using this package for academic purposes, e.g.

```
A. Barbieri and M. A. VanHove, Phase Shift Package (2007), www.icts.hkbu.edu.hk/vanhove/VanHove files/phshift2007.zip.
```

## Methodology

1. Download and extract original `phshift2007.zip` archive.
2. Refactor files for clarity.



In [1]:
dirpath = "phaseshifts/lib/phshift2007"
phshift2007_zip = dirpath + ".zip"

! wget -O "$phshift2007_zip" https://www.icts.hkbu.edu.hk/VanHove_files/leed/phshift2007.zip

--2024-01-22 22:29:57--  https://www.icts.hkbu.edu.hk/VanHove_files/leed/phshift2007.zip
Resolving www.icts.hkbu.edu.hk (www.icts.hkbu.edu.hk)... 158.182.169.112
Connecting to www.icts.hkbu.edu.hk (www.icts.hkbu.edu.hk)|158.182.169.112|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72119 (70K) [application/zip]
Saving to: 'phaseshifts/lib/phshift2007.zip'


2024-01-22 22:29:58 (329 KB/s) - 'phaseshifts/lib/phshift2007.zip' saved [72119/72119]



In [2]:
! unzip -o "$phshift2007_zip" -d "$dirpath"

Archive:  phaseshifts/lib/phshift2007.zip
  inflating: phaseshifts/lib/phshift2007/psprog.ab4  
  inflating: phaseshifts/lib/phshift2007/psprog.ab3  
  inflating: phaseshifts/lib/phshift2007/psprog.ab2  
  inflating: phaseshifts/lib/phshift2007/psprog2007.ab1  
  inflating: phaseshifts/lib/phshift2007/phshift2007.txt  


From the downloaded and extracted zip we should have the following files:

1. `phshift2007.txt` - This is the text user manual, which is included as an [appendix to the phaseshifts documentation](https://phaseshifts.readthedocs.io/en/latest/phshift2007.html).
2. `psprog2007.ab1` - An expanded version of `phshift2007.txt` that includes some additional notes and input/output file examples.
   > **NOTE:** This file does not contain any compilable code, despite having the `ab*` file extension.
3. `psprog.ab2` - Contains `ATOMIC.I` output file for Rh(111)-(2x2)-C2H3
4. `psprog.ab3` - Contains a concatenation of `PHSH0.FOR` (a.k.a. the `hartfock` program by Eric Shirley) and `PHSH1.FOR` (a.k.a. the Pendry-Titterington `CAVPOT` program for calculating muffin-tin potentials).
5. `psprog.ab4` - Contains a concatenation of `PHSH2CAV.FOR` (for CAVLEED), `PHSH2WIL.FOR` (A. R. Williams' phase shift program), `PHSH2REL.FOR` (relativistic phase shift program) and `PHSH3.FOR` (the `CONPHAS` program, which makes the phase shifts continuous). The concatenated file is not directly compilable.
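
For platforms without `wget` and `unzip`, the download and extraction steps above can be sketched with the Python standard library alone. This is a minimal illustration rather than part of the original notebook: the function names `download_archive` and `extract_archive` are hypothetical, and the download call is left commented out so the cell does not hit the network unless you want it to.

```python
import os
import urllib.request
import zipfile

# Same URL as used by the wget cell above
PHSHIFT2007_URL = "https://www.icts.hkbu.edu.hk/VanHove_files/leed/phshift2007.zip"


def download_archive(url, zip_path):
    """Download `url` to `zip_path`, creating the parent directory if needed."""
    os.makedirs(os.path.dirname(zip_path) or ".", exist_ok=True)
    urllib.request.urlretrieve(url, zip_path)
    return zip_path


def extract_archive(zip_path, dest_dir):
    """Extract every member of `zip_path` into `dest_dir` and return the sorted member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


# Hypothetical usage, mirroring the variables defined in the first cell:
# dirpath = "phaseshifts/lib/phshift2007"
# download_archive(PHSHIFT2007_URL, dirpath + ".zip")
# print(extract_archive(dirpath + ".zip", dirpath))
```

Unlike `unzip -o`, this sketch prints nothing while extracting; the returned list of names serves the same purpose as the `inflating:` lines.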


In [3]:
! head -10 "$dirpath/psprog"*.ab*

==> phaseshifts/lib/phshift2007/psprog.ab2 <==
C  file PSPROG.AB2 
C  containing only file ATOMIC.I for Rh(111)-(2x2)-C2H3
C---------------------------------------------------------------------
RELA
RELAT. ATOMIC CHARGE DENSITY
0
  .22222222D-05  .11925696D+03 100045.00
   .00001109451
   .00001147384
   .00001186615

==> phaseshifts/lib/phshift2007/psprog.ab3 <==
C  file PSPROG.AB3  Feb. 5, 1995
C  containing programs PHSH0.FOR and PHSH1.FOR
C
C---------------------------------------------------------------------
C  program PHSH0.FOR
C---------------------------------------------------------------------
c
c  there are nr grid points, and distances are in bohr radii...
c
c  r(i)=rmin*(rmax/rmin)**(dfloat(i)/dfloat(nr)) , i=1,2,3,...nr-1,nr

==> phaseshifts/lib/phshift2007/psprog.ab4 <==
C  file PSPROG.AB4  Feb. 5, 1995
C  containing programs PHSH2CAV.FOR, PHSH2WIL.FOR, PHSH2REL.FOR and
C     PHSH3.FOR
C
C---------------------------------------------------------------------
C  program P

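Those of you who have read `phshift2007.txt` or `psprog2007.ab1` will notice an inconsistency at this point between the contents of `phshift2007.zip` and the documentation: the user guide lists separate `PhSh*.for` source files, whereas the archive only contains the concatenated `psprog.ab*` files.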

In [4]:
! grep -A 20 "Phase Shift Package: Contents" "$dirpath/phshift2007.txt"

Phase Shift Package: Contents
-----------------------------

The following files should be included with this distribution. If any
are missing please contact Michel Van Hove (vanhove@cityu.edu.hk)
for replacements.

1) PhShift.doc      This file contains this user guide 
                    to use the phase-shifts programs. It should be
                    supplemented with the information contained in
                    the input files provided.
                    Includes definitions of I/O files,
                    contents and basic hints on running the programs.

2) PhSh0.for        These files contain FORTRAN programs that
   PhSh1.for        correspond to the basic steps necessary
   PhSh2rel.for     to obtain the phase shifts needed in a LEED
   PhSh2wil.for     structural determination.
   PhSh2cav.for
   PhSh3.for



## Reformatting / Refactoring psprog.ab[34] programs

This section focuses on extracting the `phsh*` programs from the `psprog.ab3` and `psprog.ab4` files.

We begin by defining a helper function for the extraction process.

In [5]:
import os.path

def extract_fortran_programs_from_ab_file(ab_filepath, output_dirpath=None):
    # type: (str, str) -> dict[str, str]
    """Extract the Fortran programs from a concatenated .ab file of the phshift2007.zip archive.

    Returns a mapping of lower-cased program name to the path of the file written.
    """
    prog_marker = "C  program "
    with open(ab_filepath, encoding="ascii") as fp_in:
        lines = fp_in.readlines()[3:]  # skip the first three header comment lines
    program_start_lines = [i for i, line in enumerate(lines) if line.startswith(prog_marker)]
    programs = {}
    for i, marker in enumerate(program_start_lines):
        # Slice off the marker prefix: str.lstrip() strips a set of characters, not a prefix
        prog_name = lines[marker][len(prog_marker):].strip().lower()
        filename = os.path.join(output_dirpath or os.path.dirname(ab_filepath), prog_name)
        # A program runs from the 'C----' rule above its marker up to the rule above the next marker
        start = marker - 1
        end = program_start_lines[i + 1] - 1 if i + 1 < len(program_start_lines) else len(lines)
        with open(filename, mode="w", encoding="ascii") as fp_out:
            fp_out.writelines(lines[start:end])
        programs[prog_name] = filename
    return programs
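
One subtlety when recovering a program name from a `C  program ...` marker line is that `str.lstrip` takes a *set of characters* to strip rather than a prefix. It happens to give the right answer for the `PHSH*.FOR` names, but it would silently eat leading characters of a name that begins with one of the marker's own letters. The short sketch below illustrates the difference; `name_via_lstrip` and `name_via_slice` are hypothetical helper names, and `CAVPOT.FOR` is a made-up file name used only to trigger the pitfall.

```python
PROG_MARKER = "C  program "


def name_via_lstrip(line):
    """Strip any leading characters found in PROG_MARKER (a character set, not a prefix)."""
    return line.lstrip(PROG_MARKER).strip().lower()


def name_via_slice(line):
    """Remove exactly the marker prefix by slicing, then tidy whitespace and case."""
    return line[len(PROG_MARKER):].strip().lower()


# Both approaches agree for the names found in the archive...
print(name_via_lstrip("C  program PHSH0.FOR\n"))   # phsh0.for
print(name_via_slice("C  program PHSH0.FOR\n"))    # phsh0.for

# ...but differ for a (hypothetical) name starting with 'C', which is in the marker's character set
print(name_via_lstrip("C  program CAVPOT.FOR\n"))  # avpot.for
print(name_via_slice("C  program CAVPOT.FOR\n"))   # cavpot.for
```

On Python 3.9 or later, `str.removeprefix` is another way to express the slicing variant.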


### Splitting up `psprog.ab3`

Here we will split `psprog.ab3` into `phsh0.for` and `phsh1.for` and compile each.


In [6]:
! rm -f "$dirpath"/phsh*.for

print(extract_fortran_programs_from_ab_file(os.path.join(dirpath, "psprog.ab3")))

! head -16 "$dirpath/phsh"[0-1].for

{'phsh0.for': 'phaseshifts/lib/phshift2007/phsh0.for', 'phsh1.for': 'phaseshifts/lib/phshift2007/phsh1.for'}
==> phaseshifts/lib/phshift2007/phsh0.for <==
C---------------------------------------------------------------------
C  program PHSH0.FOR
C---------------------------------------------------------------------
c
c  there are nr grid points, and distances are in bohr radii...
c
c  r(i)=rmin*(rmax/rmin)**(dfloat(i)/dfloat(nr)) , i=1,2,3,...nr-1,nr
c
c
c
c  the orbitals are store in phe(), first index goes 1...nr, the
c  second index is the orbital index (i...nel)
c
c  look at the atomic files after printing this out to see everything...
c
c  suffice it to say, that the charge density at radius r(i)

==> phaseshifts/lib/phshift2007/phsh1.for <==
C---------------------------------------------------------------------
C  program PHSH1.FOR
C---------------------------------------------------------------------
C
C  adated from CAVPOT 
C
      PROGRAM CAVPOT
C  PHASE SHIFT PROGRAM FROM CA

### Compiling `phsh0` and `phsh1`

In [18]:
SRC_DIR="phaseshifts/lib/phshift2007"
OUT_DIR="phaseshifts/lib/phshift2007"
GFORTRAN_FLAGS="-Wall -Wno-unused-label -Wno-tabs -Wno-unused-variable -Wno-unused-dummy-argument -fcheck=bounds -frecursive -std=legacy"

!gfortran $GFORTRAN_FLAGS "$SRC_DIR/phsh0.for" -o "$OUT_DIR/phsh0" 2>&1 | tee "$OUT_DIR/phsh0.logs" && printf "\n\nSUCCESS: " && ls "$OUT_DIR/phsh0"

phaseshifts/lib/phshift2007/phsh0.for:261:9:

  261 |         nj=xnj(i)+xnj(i)
      |                1
phaseshifts/lib/phshift2007/phsh0.for:1473:12:

 1473 |          nden=nr*(log(rden/rmin)/log(rmax/rmin))
      |                   1
phaseshifts/lib/phshift2007/phsh0.for:1220:10:

 1220 |         jrc=1.d0+dfloat(nr-1)*dlog(rcut /rmin)/dlog(rmax/rmin)
      |                 1
phaseshifts/lib/phshift2007/phsh0.for:1223:10:

 1223 |         jrt=1.d0+dfloat(nr-1)*dlog(rtest/rmin)/dlog(rmax/rmin)
      |                 1
phaseshifts/lib/phshift2007/phsh0.for:1101:12:

 1101 |         izuse=dabs(vi(nr-2,1)*r(nr-2))+0.5d0
      |                   1
phaseshifts/lib/phshift2007/phsh0.for:1075:37:

 1075 |      1              nel,nl,nm,no,xnj,rpower,xnum,etot2,iuflag)
      |                                     1
phaseshifts/lib/phshift2007/phsh0.for:1156:47:

 1156 |      1              nr,r,r2,dl,q0,xm1,xm2,njrc,dummy)
      |                                               1


SUCCESS: ph

Note that the compiler warns about a rank mismatch for a variable called `dummy`, which is used as parameter `vi` in the `rpower` subroutine. This was corrected in the `libphsh.f` file; the same correction could be applied to `phsh0.for` as a patch.

In [17]:
!gfortran $GFORTRAN_FLAGS "$SRC_DIR/phsh1.for" -o "$OUT_DIR/phsh1" && printf "\n\nSUCCESS: " && ls "$OUT_DIR/phsh1"

phaseshifts/lib/phshift2007/phsh1.for:360:72:

  360 | T(9H &NL16 Z=,f7.4,4H,RT=,f7.4,5H &END)                                             C
      |                                                             1

phaseshifts/lib/phshift2007/phsh1.for:28:15:

   28 |       INDEX(X)=20.0*(ALOG(X)+8.8)+2.0
      |               1
phaseshifts/lib/phshift2007/phsh1.for:27:8:

   27 |      +4HRELA,4HHERM,4HCLEM,4HPOTE/
      |        1
phaseshifts/lib/phshift2007/phsh1.for:27:15:

   27 |      +4HRELA,4HHERM,4HCLEM,4HPOTE/
      |               1
phaseshifts/lib/phshift2007/phsh1.for:27:22:

   27 |      +4HRELA,4HHERM,4HCLEM,4HPOTE/
      |                      1
phaseshifts/lib/phshift2007/phsh1.for:27:29:

   27 |      +4HRELA,4HHERM,4HCLEM,4HPOTE/
      |                             1
phaseshifts/l

Here again we see that we can compile the program, albeit with warnings. This time the warnings concern the COMMON block `WK`, whose arrays are declared with differing (inconsistent) sizes in the subroutines/functions that use them. Furthermore, the warnings state that in some cases the sizes used are greater than the size of the common block, which suggests serious issues. The obvious solution is to increase the size of the common block and use the same sizes throughout the program.

### Splitting up `psprog.ab4`

Here we will split `psprog.ab4` into `phsh2cav.for`, `phsh2wil.for`, `phsh2rel.for` and `phsh3.for` and compile each.
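
To get a quick overview of where a COMMON block is declared inconsistently, a rough scan of the source can help before touching the Fortran. The sketch below is a heuristic only, not what was done for `libphsh.f`: it assumes each `COMMON /NAME/ ...` statement fits on a single fixed-form line, it ignores continuation lines and sizes given in separate `DIMENSION` statements, and the names `find_common_declarations` and `inconsistent_common_blocks` are hypothetical.

```python
import re
from collections import defaultdict

# Matches e.g. "      COMMON /WK/ X(250),Y(250)" (single-line declarations only)
COMMON_RE = re.compile(r"^\s+COMMON\s*/\s*(\w+)\s*/\s*(.*?)\s*$", re.IGNORECASE)


def find_common_declarations(lines):
    """Map each COMMON block name to a list of (line number, normalised member list)."""
    decls = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        if line[:1] in ("C", "c", "*", "!"):
            continue  # fixed-form comment line
        match = COMMON_RE.match(line.rstrip("\r\n")[:72])  # ignore columns beyond 72
        if match:
            name, members = match.groups()
            decls[name.upper()].append((lineno, members.replace(" ", "").upper()))
    return dict(decls)


def inconsistent_common_blocks(lines):
    """Return only those blocks whose declarations differ textually between occurrences."""
    return {
        name: found
        for name, found in find_common_declarations(lines).items()
        if len({members for _, members in found}) > 1
    }


# Hypothetical usage against one of the extracted files:
# with open("phaseshifts/lib/phshift2007/phsh1.for", encoding="ascii") as fp:
#     print(inconsistent_common_blocks(fp.readlines()))
```

A textual difference is only a hint, since members can legitimately be renamed between program units; the compiler warnings remain the authoritative source.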


In [9]:
extract_fortran_programs_from_ab_file(os.path.join(dirpath, "psprog.ab4"))

! head -16 "$dirpath/phsh"[2-3]*.for

==> phaseshifts/lib/phshift2007/phsh2cav.for <==
C---------------------------------------------------------------------
C  program PHSH2CAV.FOR
C---------------------------------------------------------------------
C
C
C  POTENTIAL-TO-PHASE-SHIFT PROGRAM (CAVLEED PACKAGE)
C
C  MAIN PROGRAM FOR CALCULATION OF PHASE SHIFTS USING POTENTIAL
C  ON LOUCKS GRID (E.G. AS SUPPLIED BY THE MUFFIN-TIN POTENTIAL
C  PROGRAM).  ENERGIES INPUT IN HARTREES.
      DIMENSION V(250),RX(250),PHS(20)
      REAL*8 NAME(2),MTZ,delstore(401,15),estore(401)
C
C First input channels
C
      OPEN (UNIT=4,FILE='mufftin.d',STATUS='OLD')

==> phaseshifts/lib/phshift2007/phsh2rel.for <==
C---------------------------------------------------------------------
C  program PHSH2REL.FOR
C---------------------------------------------------------------------
      PROGRAM PHASE
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      CHARACTER OPT*3,OPT1*3,OPTS*3,ANAME*2,AN*30,BDATA*28
      CHARACTER SUB*3,RECORD*3,TL*1,SL*1,SS1*6,S

In [10]:
! ls "$dirpath"/*.for

phaseshifts/lib/phshift2007/libphsh.for
phaseshifts/lib/phshift2007/phsh0.for
phaseshifts/lib/phshift2007/phsh1.for
phaseshifts/lib/phshift2007/phsh2cav.for
phaseshifts/lib/phshift2007/phsh2rel.for
phaseshifts/lib/phshift2007/phsh2wil.for
phaseshifts/lib/phshift2007/phsh3.for


In [11]:
! gfortran $GFORTRAN_FLAGS "$dirpath/phsh2cav.for" -o "$dirpath/phsh2cav" && printf "\n\nSUCCESS: " && ls "$dirpath/phsh2cav"

phaseshifts/lib/phshift2007/phsh2cav.for:30:11:

   30 |       ianz=(emax-emin)/estep +1.01
      |           1
phaseshifts/lib/phshift2007/phsh2cav.for:97:15:

   97 |       INDEX(X)=20.*(ALOG(X)+8.8)+2.
      |               1


SUCCESS: phaseshifts/lib/phshift2007/phsh2cav


Compiler warnings were emitted about a possible change of value when converting from `REAL(4)` to `INTEGER(4)`. These conversions should probably be reviewed to decide whether rounding (`NINT`), `FLOOR` or `CEILING` would be more appropriate than the implicit truncation.
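
To see why the choice of conversion matters, the following sketch mimics the flagged statement `ianz=(emax-emin)/estep +1.01` in Python. It is an illustration under stated assumptions: Python's `int()` truncates toward zero, as the implicit Fortran real-to-integer assignment does, but Python floats are double precision whereas the Fortran code uses `REAL(4)`, so the exact values at which rounding error bites will differ. The function name `n_energies` and the sample energies are hypothetical.

```python
import math


def n_energies(emin, emax, estep, fudge=1.01):
    """Mimic `ianz=(emax-emin)/estep +1.01`: add the fudge factor, then truncate toward zero."""
    return int((emax - emin) / estep + fudge)


# (0.3 - 0.0) / 0.1 evaluates to 2.9999999999999996 in double precision, so adding
# exactly 1.0 and truncating would lose a grid point; the extra 0.01 guards against this.
print(n_energies(0.0, 0.3, 0.1, fudge=1.0))  # 3
print(n_energies(0.0, 0.3, 0.1))             # 4

# The candidate conversions disagree for negative or fractional values:
x = -1.7
print(int(x), math.floor(x), math.ceil(x), round(x))  # -1 -2 -1 -2
```

Note that Python's `round` rounds halves to the nearest even integer, whereas Fortran's `NINT` rounds halves away from zero, so the two are not interchangeable at exactly `x.5`.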

In [12]:
! gfortran $GFORTRAN_FLAGS "$dirpath/phsh2rel.for" -o "$dirpath/phsh2rel" && printf "\n\nSUCCESS: " && ls "$dirpath/phsh2rel"

phaseshifts/lib/phshift2007/phsh2rel.for:67:72:

   67 |     READ(5,2) (ZP(J),J=1,JRI)                                         C
      |                                                                      1

phaseshifts/lib/phshift2007/phsh2rel.for:181:72:

  181 | MMON /Z/ T                                                      CCCCCCC
      |                                                                1

phaseshifts/lib/phshift2007/phsh2rel.for:259:72:

  259 | 27  UNP=U(NM5)+X20*(T11*(UP(N)+UP(NM4))+T26*UP(NM2)                   -
      |                                                                      1

phaseshifts/lib/phshift2007/phsh2rel.for:261:72:

  261 |     WNP=W(NM5)+X20*(T11*(WP(N)+WP(NM4))+T26*WP(NM2)                   -
      |                                                                      1

phaseshifts/lib/phshift2007/phsh2re

In [13]:
! gfortran $GFORTRAN_FLAGS "$dirpath/phsh2wil.for" -o "$dirpath/phsh2wil" && printf "\n\nSUCCESS: " && ls "$dirpath/phsh2wil"

phaseshifts/lib/phshift2007/phsh2wil.for:389:26:

  389 |       DATA B, C, O, D / 1H , 1H*, 1H0, 1HI /
      |                          1
phaseshifts/lib/phshift2007/phsh2wil.for:389:31:

  389 |       DATA B, C, O, D / 1H , 1H*, 1H0, 1HI /
      |                               1
phaseshifts/lib/phshift2007/phsh2wil.for:389:36:

  389 |       DATA B, C, O, D / 1H , 1H*, 1H0, 1HI /
      |                                    1
phaseshifts/lib/phshift2007/phsh2wil.for:389:41:

  389 |       DATA B, C, O, D / 1H , 1H*, 1H0, 1HI /
      |                                         1
phaseshifts/lib/phshift2007/phsh2wil.for:398:11:

  398 |       J0 = -Y1 * T + 1.5
      |           1
phaseshifts/lib/phshift2007/phsh2wil.for:406:10:

  406 |       J = T * (Y(I) - Y1) + 1.5
      |          1


SUCCESS: phaseshif

A couple of notable warnings were emitted when compiling `PHSH2WIL.FOR`, all of them conversion warnings, either from Hollerith constants to `REAL(4)` or from `REAL(4)` to `INTEGER(4)`.

In [14]:
! gfortran $GFORTRAN_FLAGS "$dirpath/phsh3.for" -o "$dirpath/phsh3" && printf "\n\nSUCCESS: " && ls "$dirpath/phsh3"



SUCCESS: phaseshifts/lib/phshift2007/phsh3


No worrying issues were identified when compiling `PHSH3.FOR`.

> **TIP**: To produce the above programs quickly, run `make phshift2007` from the project root. Note that the outputs may differ slightly from those above, depending on whether and how the programs were split on the `'C  program '` marker lines.

## Converting to a library

At this point we have a full suite of phshift2007 binaries, which we can test I/O against, but which are intended to be run as standalone programs.

Crudely speaking `phaseshifts/lib/libphsh.f` is a concatenation of all of the `phsh*.for` files, however it has had significant additional changes. These could be inspected via a diff tool to get an idea of the types and scale of changes.
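
As a starting point for such a comparison, Python's standard `difflib` module can give a rough count of how many lines differ between two versions of a source file. This is a minimal sketch rather than the procedure actually used; the helper name `count_changed_lines` is hypothetical, and the commented-out paths simply reuse the file names mentioned in this notebook.

```python
import difflib


def count_changed_lines(old_lines, new_lines):
    """Return (added, removed) line counts between two sequences of source lines."""
    added = removed = 0
    for entry in difflib.ndiff(old_lines, new_lines):
        # ndiff prefixes every entry with a two-character code: '+ ', '- ', '  ' or '? '
        if entry.startswith("+ "):
            added += 1
        elif entry.startswith("- "):
            removed += 1
    return added, removed


old = ["      PROGRAM CAVPOT", "      X=1", "      END"]
new = ["      SUBROUTINE CAVPOT", "      X=1", "      END"]
print(count_changed_lines(old, new))  # (1, 1)

# Hypothetical usage against the real files:
# with open("phaseshifts/lib/phshift2007/libphsh.for") as a, open("phaseshifts/lib/libphsh.f") as b:
#     print(count_changed_lines(a.readlines(), b.readlines()))
```

`difflib.ndiff` can be slow on files that differ heavily, so for a first pass over several thousand lines a dedicated diff tool may be more practical.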

In [15]:
! cat "$dirpath/phsh0.for" "$dirpath/phsh1.for" "$dirpath/phsh2cav.for" "$dirpath/phsh2wil.for" "$dirpath/phsh2rel.for" "$dirpath/phsh3.for" | tee "$dirpath/libphsh.for" | grep --line-number 'C  program ' | sed 's/C  program //'

2:PHSH0.FOR
1632:PHSH1.FOR
2741:PHSH2CAV.FOR
2955:PHSH2WIL.FOR
3369:PHSH2REL.FOR
3693:PHSH3.FOR


A few simple operations were then applied to the concatenated file, namely:

1. Trailing whitespace was removed.
2. Line continuation characters (`1`, `C` or `c` in column 6) were changed to `+` for clarity.
3. Comment markers in column 1 were changed from `C`/`c` to `!`.

In [16]:
! sed -i 's/[ ]*$//;s/^     [1Cc]/     +/;s/^[Cc]/!/' "$dirpath/libphsh.for"
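
Where GNU `sed -i` is not available, the same three substitutions can be approximated in Python. The sketch below applies them in the same order as the `sed` expression; `normalise_line` is a hypothetical name, and the file-rewriting usage is shown only as a comment.

```python
import re


def normalise_line(line):
    """Apply the three clean-up substitutions to one fixed-form source line (no newline)."""
    line = line.rstrip(" ")                        # s/[ ]*$//
    line = re.sub(r"^     [1Cc]", "     +", line)  # continuation marker in column 6 -> '+'
    line = re.sub(r"^[Cc]", "!", line)             # comment marker in column 1 -> '!'
    return line


print(normalise_line("C  program PHSH0.FOR   "))          # !  program PHSH0.FOR
print(normalise_line("     1              nel,nl,nm"))    # "     +              nel,nl,nm"
print(normalise_line("      PROGRAM CAVPOT"))             # unchanged

# Hypothetical usage to rewrite the concatenated file in place:
# path = "phaseshifts/lib/phshift2007/libphsh.for"
# with open(path, encoding="ascii") as fp:
#     cleaned = [normalise_line(l) for l in fp.read().splitlines()]
# with open(path, "w", encoding="ascii") as fp:
#     fp.write("\n".join(cleaned) + "\n")
```

As with the `sed` version, continuation markers other than `1`, `C` or `c` are left untouched.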

The program blocks were then converted into subroutines and the I/O adjusted on the assumption that the code would be used as a library rather than as standalone executables.

Significant further modifications were made to try to make the code more standards compliant and more compatible with modern Fortran (including removing `GOTO` statements where possible and using `ALLOCATABLE` arrays).