StringiFor, Strings Fortran Manipulator, yet another strings Fortran module
A KISS pure Fortran library providing astrings (class) manipulator for modern (2003+) Fortran projects.
- StringiFor is a pure Fortran (KISS) library providing a strings manipulator for modern Fortran projects;
- StringiFor is Fortran 2003+ standard compliant;
- StringiFor is OOP designed;
- StringiFor is TDD designed;
- StringiFor is a Free, Open Source Project.
GNU partial support
GNU gfortran does not support user-defined-type-IO, thus some class features are disabled if GNU is used.
What is StringiFor?
Modern Fortran standards (2003+) have introduced a better support for characters variables, but Fortraners still do not have the power on dealing with strings of other more-rich-programmers, e.g. Pythoners. Allocatable deferred length character variables are now quantum-leap with respect the old inflexible Fortran characters, but it is still not enough for many Fortraners. Moreover, Fortran does not provide builtin methods for widely used strings manipulations offered by other languages, e.g. UPPER/lowercase transformation, tokenization, etc... StringiFor attempts to fill this lack.
Go to Top
StringiFor exposes only one class (OO-designed), the
string type, that should be used as a more powerful string variable with respect a standard Fortran
character variable. The main features of this class are:
- seamless interchangeability with standard character variables, e.g. concatenation, IO, etc...;
- handy builtin methods, e.g. split, search, basename, join, etc...;
- low memory consumption: only one deferred length allocatable character member is stored, allowing for efficient memory allocation in array of strings, the elements of which can have different lengths;
- safe: almost all methods are elemental or pure;
- robust: the library is Test Driven Developed TDD, a comprehensive tests suite is provided.
Any feature request is welcome.
Go to Top
A Taste of StringiFor
StringiFor is very handy...
string IO is overloaded by defined write/read TBP. Moreover, dedicated methods and operators can be exploited for IO, e.g.
use stringifor type(string) :: astring astring = 'Hello World' print "(A)", astring%chars() ! "chars" method returns a standard character variable print "(DT)", astring ! defined IO is not enabled with GNU gfortran print "(A)", astring//'' ! on-the-fly conversion to standard character by means of concatenation
string has many methods for a plethora of strings manipulations, e.g.
use stringifor type(string) :: astring type(string) :: strings(3) astring = '0123456789' print "(A)", astring%reverse()//'' ! print "9876543210" astring = 'Hello World' print "(A)", astring%replace(old='World', new='People')//'' ! print "Hello People" astring = 'Hello World' strings = astring%partition(sep='lo Wo') print "(A)", 'Before sep: "'//strings(1)//'"' ! print "Hel" print "(A)", 'Sep itself: "'//strings(2)//'"' ! print "lo Wo" print "(A)", 'After sep: "'//strings(3)//'"' ! print "rld" strings(1) = 'one' strings(2) = 'two' strings(3) = 'three' print "(A)", astring%join(strings)//'' ! print "onetwothree" print "(A)", astring%join(strings, sep='-')//'' ! print "one-two-three" astring = ' a StraNgE caSe var' print "(A)", astring%camelcase()//'' ! print " AStrangeCaseVar" print "(A)", astring%snakecase()//'' ! print " a_strange_case_var" print "(A)", astring%startcase()//'' ! print " A Strange Case Var"
StringiFor, by means of the portability environment library, PENF can handle numbers (reals and integers) effortless. The string/number casting (to/from and viceversa) is done by overloaded assignments (for all kinds of integers and reals). For convenience, StringiFor exposes the PENF number portable kind parameters.
use stringifor type(string) :: astring astring = 127 _I1P ! "I1P" is the PENF kind for 1-byte-like integer. print "(A)", astring//'' ! print "+127" astring = 3.021e6_R4P ! "R4P" is the PENF kind for 4-byte-like real. print "(A)", astring//'' ! print "+0.302100E+07" astring = "3.4e9" ! assign to a string without the necessity to define a real kind if (astring%is_number()) then if (astring%is_real()) then print "(E13.6)", astring%to_number(kind=1._R4P) ! print " 0.340000E+10" using a 4-byte-like kind endif endif
StingiFor is developed to improve the poor Fortran people with daily strings-usage, however, also complex scenario is taken into account, e.g. file parsing, OS operations, etc...
use stringifor type(string) :: astring ! OS like manipulation astring = '/bar/foo.tar.bz2' print "(A)", astring%basedir()//'' ! print "/bar" print "(A)", astring%basename()//'' ! print "foo.tar.bz2" print "(A)", astring%basename(extension='.tar')//'' ! print "foo" print "(A)", astring%basename(last_extension=.true.)//'' ! print "foo.tar" ! XML like tag parsing astring = '<test> <first> hello </first> <first> not the first </first> </test>' print "(A)", astring%search(tag_start='<first>', tag_end='</first>')//'' ! print "<first> hello </first>"
A naive CSV parser
This is just a provocation, but with StringiFor it is easy to develop a naive CSV parser. Let us assume we want to parse a cars-price database as the following one
Year, Make, Model, Description, Price 1997, Ford, E350 , ac abs moon, 3000.00 1999, Chevy, Venture "Extended Edition" , , 4900.00 1999, Chevy, Venture "Extended Edition Very Large", , 5000.00
Well, parsing it and handling its cells values is very easy by means of StringiFor
use stringifor implicit none type(string) :: csv !< The CSV file as a single stream. type(string), allocatable :: rows(:) !< The CSV table rows. type(string), allocatable :: columns(:) !< The CSV table columns. type(string), allocatable :: cells(:,:) !< The CSV table cells. type(string) :: most_expensive !< The most expensive car. real(R8P) :: highest_cost !< The highest cost. integer :: rows_number !< The CSV file rows number. integer :: columns_number !< The CSV file columns number. integer :: r !< Counter. ! parsing the just created CSV file: all done 9 statements! call csv%read_file(file='cars.csv') ! read the CSV file as a single stream call csv%split(tokens=rows, sep=new_line('a')) ! get the CSV file rows rows_number = size(rows, dim=1) ! get the CSV file rows number columns_number = rows(1)%count(',') + 1 ! get the CSV file columns number allocate(cells(1:columns_number, 1:rows_number)) ! allocate the CSV file cells do r=1, rows_number ! parse all cells call rows(r)%split(tokens=columns, sep=',') ! get current columns cells(1:columns_number, r) = columns ! save current columns into cells enddo ! now you can do whatever with your parsed data ! print the table in markdown syntax print "(A)", 'A markdown-formatted table' print "(A)", '' print "(A)", '|'//csv%join(array=cells(:, 1), sep='|')//'|' columns = '----' ! re-use columns for printing separators print "(A)", '|'//csv%join(array=columns, sep='|')//'|' do r=2, rows_number print "(A)", '|'//csv%join(array=cells(:, r), sep='|')//'|' enddo print "(A)", '' ! find the most expensive car print "(A)", 'Searching for the most expensive car' most_expensive = 'unknown' highest_cost = -1._R8P do r=2, rows_number if (cells(5, r)%to_number(kind=1._R8P)>=highest_cost) then highest_cost = cells(5, r)%to_number(kind=1._R8P) most_expensive = csv%join(array=[cells(2, r), cells(3, r)], sep=' ') endif enddo print "(A)", 'The most expensive car is : '//most_expensive
See the test program csv_naive_parser for a working example.
Obviously, this is a naive parser without any robustness, but it proves the usefulness of the StringiFor approach.
Go to Top
StringiFor is an open source project, it is distributed under a multi-licensing system:
Anyone is interest to use, to develop or to contribute to StringiFor is welcome, feel free to select the license that best matches your soul!
More details can be found on wiki.
Go to Top
StringiFor home is at https://github.com/szaghi/StringiFor. It uses
git submodule to handle the third party dependencies. To download all the source files you can:
- clone this repository (all dependencies are satisfied):
git clone https://github.com/szaghi/StringiFor
git submodule update --init
- download only the StringiFor sources, all other dependencies must be downloaded manually:
- download the latest master-branch archive:
git submodule update --init
- download a release archive at https://github.com/szaghi/StringiFor/releases
- download the latest master-branch archive:
Third Party dependencies
Currently StringiFor depends on:
If you download a release of StringiFor manually (without git) you must download manually the above dependencies and place them into
src/third_party sub-directory of the project root-tree.
Go to Top
StringiFor is a modern Fortran project thus a modern Fortran compiler is need to compile the project. In the following table the support for some widely-used Fortran compilers is summarized.
|Compiler Vendor Support||Notes|
|does not support defined IO|
The library is modular, namely it exploits Fortran modules. As a consequence, there is compilation-cascade hierarchy to build the library. To correctly build the library the following approaches are supported
- Build by means of FoBiS: full support;
- Build by means of GNU Make: support only static-linked library building (not shared) for both Intel Fortran and GNU gfortran;
- Build by means of CMake: to be implemented.
The FoBiS building support is the most complete, as it is the one used for the developing StringiFor.
Build by means of FoBiS
fobos file is provided to build the library by means of the Fortran Building System FoBiS.
Build all tests
After (a successuful) building a directory
./exe is created containing all the compiled tests that constitute the StringiFor regression-tests-suite, e.g.
→ FoBiS.py build Builder options Directories Building directory: "exe" Compiled-objects .o directory: "exe/obj" Compiled-objects .mod directory: "exe/mod" Compiler options Vendor: "gnu" Compiler command: "gfortran" Module directory switch: "-J" Compiling flags: "-c -frealloc-lhs -std=f2008 -fall-intrinsics -O2 -Dr16p" Linking flags: "-O2" Preprocessing flags: "-Dr16p" Coverage: False Profile: False PreForM.py used: False PreForM.py output directory: None PreForM.py extensions processed:  Building src/tests/is_real.f90 Compiling src/lib/penf.F90 serially Compiling src/lib/string_t.F90 serially Compiling src/lib/stringifor.F90 serially Compiling src/tests/is_real.f90 serially Linking exe/is_real Target src/tests/is_real.f90 has been successfully built Builder options Directories Building directory: "exe" Compiled-objects .o directory: "exe/obj" Compiled-objects .mod directory: "exe/mod" Compiler options Vendor: "gnu" Compiler command: "gfortran" Module directory switch: "-J" Compiling flags: "-c -frealloc-lhs -std=f2008 -fall-intrinsics -O2 -Dr16p" Linking flags: "-O2" Preprocessing flags: "-Dr16p" Coverage: False Profile: False PreForM.py used: False PreForM.py output directory: None PreForM.py extensions processed:  Building src/tests/slen.f90 Compiling src/tests/slen.f90 serially ... → tree -L 1 exe/ exe/ ├── assignments ├── basename_dir ├── camelcase ├── capitalize ├── concatenation ├── equal ├── escape ├── extension ├── fill ... ├── swapcase ├── to_number ├── unique └── upper_lower
Build the library
# static-linked library by means of GNU gfortran FoBiS.py build -mode stringifor-static-gnu # shared-linked library by means of GNU gfortran FoBiS.py build -mode stringifor-shared-gnu # static-linked library by means of Intel Fortran FoBiS.py build -mode stringifor-static-intel # shared-linked library by means of Intel Fortran FoBiS.py build -mode stringifor-shared-intel
The library will be built into the directory
List other fobos modes
To list all fobos-provided modes type
→ FoBiS.py build -lmodes The fobos file defines the following modes: - "tests-gnu" - "tests-gnu-debug" - "tests-intel" - "tests-intel-debug" - "stringifor-static-gnu" - "stringifor-shared-gnu" - "stringifor-static-intel" - "stringifor-shared-intel"
It is worth to note that the first mode is the one automatically called by
Build by means of GNU Make
The provided makefile support only static-linked library building (not shared one) with both Intel Fortran Compiler and GNU gfortran, and it has two main building rules:
- build the (static linked) library;
- build the tests suite.
the GNU gfortran compiler is the default one, but the compiler used can be customized with COMPILER=#vendor switch.
To build the library type with the GNU gfortran compiler.
The library will be built into the directory
To build the tests suite type
The tests will be built into the directory
If you want to use Intel Fortran Compiler add the switch
COMPILER=intel to the above commands, i.e.
make COMPILER=intel # build only the library make COMPILER=intel TESTS=yes # build the tests suite
Build by means of CMake
To be done.
Go to Top
The StringiFor documentation is mainly contained into this file (it has its own wiki with some less important documents). Detailed documentation of the API is contained into the GitHub Pages that can also be created locally by means of ford tool.
In the following all the methods of
string are listed with a brief description of their aim. The hyperlinks bring you to the full API explained into the GH pages.
|basedir||return the base directory name of a string containing a file name|
|basename||return the base file name of a string containing a file name|
|camelcase||return a string with all words capitalized without spaces|
|capitalize||return a string with its first character capitalized and the rest lowercased|
|chars||return the raw characters data|
|escape||escape backslashes (or custom escape character)|
|extension||return the extension of a string containing a file name|
|fill||pad string on the left (or right) with zeros (or other char) to fill width|
|free||free dynamic memory|
|insert||insert substring into string at a specified position|
|join||return a string that is a join of an array of strings or characters|
|lower||return a string with all lowercase characters|
|partition||split string at separator and return the 3 parts (before the separator and after)|
|read_file||read a file a single string stream|
|read_line||read line (record) from a connected unit|
|read_lines||read (all) lines (records) from a connected unit as a single ascii stream|
|replace||return a string with all occurrences of substring old replaced by new|
|reverse||return a reversed string|
|search||search for tagged record into string|
|slice||return the raw characters data sliced|
|snakecase||return a string with all words lowercase separated by _|
|split||return a list of substring in the string using sep as the delimiter string|
|startcase||return a string with all words capitalized, e.g. title case|
|strip||return a string with the leading and trailing characters removed|
|swapcase||return a string with uppercase chars converted to lowercase and vice versa|
|tempname||return a safe temporary name suitable for temporary file or directories|
|to_number||cast string to number|
|unescape||unescape double backslashes (or custom escaped character)|
|unique||reduce to one (unique) multiple occurrences of a substring into a string|
|upper||return a string with all uppercase characters|
|write_file||write a single string stream into file|
|write_line||write line (record) to a connected unit|
|write_lines||write lines (records) to a connected unit|
|end_with||return true if a string ends with a specified suffix|
|is_allocated||return true if the string is allocated|
|is_digit||return true if all characters in the string are digits|
|is_integer||return true if the string contains an integer|
|is_lower||return true if all characters in the string are lowercase|
|is_number||return true if the string contains a number (real or integer)|
|is_real||return true if the string contains an real|
|is_upper||return true if all characters in the string are uppercase|
|start_with||return true if a string starts with a specified prefix|
|assignment||assignment of string from different inputs|
|//||concatenation resulting in characters for seamless integration|
|.cat.||concatenation resulting in
|/=||not equal operator|
|<||lower than operator|
|<=||lower equal than operator|
|>=||greater equal than operator|
|>||greater than operator|
Go to Top
Comparison to other Approaches
The lack of Fortran support for strings manipulation has promoted different solutions in the past years. Following the classification of Clive Page  we can consider:
- standard character type;
- deferred-length allocatable character type (standard 2003+);
VARYING_STRINGtype (standard 90/95+) as defined in ISO/IEC 1539-2:2000 (Varying length character strings).
Let us compare StringiFor to the previous three approaches. In particular, let us consider Ian Harvey extension of
VARYING_STRING, i.e. the
Clive Page had pointed out the following issues, among the others:
- fixed (at compile time) string length
character(len=3) :: astring ! further lengths different from 3 are not allowed
- silent truncation on assignment
character(len=3) :: astring astring = 'abcdefgh' ! silent trunctation at 'abc'
- trim-cluttered code
character(len=99) :: astring character(len=99) :: anotherstring astring = 'abcdefgh' anotherstring = trim(astring)//'ilmnopqrst' ! trim-cluttering is a necessity
- handle significant trailing spaces
character(len=99) :: astring character(len=99) :: anotherstring astring = 'Hello ' ! for some reasons you want to keep these trailing white spaces anotherstring = trim(astring)//'World' ! you need trim because ! len(astring)==len(anotherstring), but lost the significant ! trailing spaces...
- different character definition
character :: astring*10 ! old way character(len=10) :: anotherstring ! new way
- allocation of array of strings
character(len=10), allocatable :: astring(:) allocate(astring(100)) ! all 100 elements of the array have 10 characters, ! different lengths cannot be declared
- initialization of array of strings
! the following is illegal character(len=9), parameter :: day(7) = ['Monday', & 'Tuesday', & 'Wednesday', & 'Thursday', & 'Friday', & 'Saturday', & 'Sunday'] ! the following is legal, but cluttered by non significant trailing spaces character(len=9), parameter :: day(7) = ['Monday ', & 'Tuesday ', & 'Wednesday', & 'Thursday ', & 'Friday ', & 'Saturday ', & 'Sunday']
- IO limitations for non standard character variables
character(len=99) :: astring character(len=:), allocatable :: anotherstring type(varying_string) :: yetanotherstring ! fully-simple support for standard character variables astring = 'abcdefgh' print*, astring print "(A)", astring read(10, *) astring ! partial-simple support for standard deferred length-length allocatable character variables ! care must be placed in input operation... print*, anotherstring print "(A)", anotherstring read(10, *) anotherstring ! support depends on the implementation of the varying string type print*, yetanotherstring print "(DT)", yetanotherstring read(10, *) yetanotherstring
- substring notation (slice) for non standard character variables
character(len=99) :: astring character(len=:), allocatable :: anotherstring type(varying_string) :: yetanotherstring astring = 'abcdefgh' yetanotherstring = astring anotherstring = astring(2:6) ! allowed anotherstring = yetanotherstring(2:6) ! not allowed
- passing string to procedures expecting standard character argument is complicated
Analyzing the above issues we can agree that deferred-length allocatable character and aniso_varyng_string approaches address many of them, at the cost of introducing some oddies.
deferred-length allocatable character
This approaches addresses all the issues related to the fixed length limitation, e.g.
character(len=:), allocatable :: astring character(len=:), allocatable :: anotherstring astring = 'Hello ' anotherstring = astring//'World' ! trailing with spaces of astring correctly handled ! no need of trim
However, it has some limitations too. Aside the input operation, the most important (IMHO) are related to arrays of strings handling, e.g.
character(len=:), allocatable :: asetofstring(:) allocate(character(len=99) :: asetofstring(10)) ! all 10 elements must have len=99
Aniso_varying_string is an implemention of ISO/IEC 1539-2:2000 (Varying length character strings) developed by Ian Harvey that is internally based on a deferred-lenght allocatable character variable: it is essentially a derived type wrapping a deferred-lenght allocatable character. As a consequence, it has all the advantages of the deferred-length allocatable character approach. The wrapping approach addresses the arrays related issues, e.g.
type(varying_string), allocatable :: asetofstring(:) allocate(asetofstring(10)) ! all 10 elements can have diffent lengths
Its major issues are related to IO operations: however, this is addressed by new Fortran support for defined IO for derived type that make more effortless the IO of such an object. The other main issue is the impossibility to use the standard slice notation to access to substring: aniso_varying_string addresses (partially) this issue by public-exposing the wrapped allocatable character of its implementations thus allowing the slicing of it, e.g.
type(varying_string) :: astring astring = 'abcdefg' print "(A)", astring%chars(2:3) ! print 'bc'
StringiFor shares the same philosophy of aniso_varying_string, thut it has the same pros and cons. However, StringiFor is an Object Oriented Designed class, thus it has some peculiariaties distinguishing it from aniso_varying_string, see StringiFor Peculiarities.
The following table summarizes the comparison analysis.
|issue||standard character||deferred-length allocatable character||aniso_varying_string||StringiFor|
|significant trailing spaces|
|different string definition|
|substring (slice) notation|
|bad or no support|
StringiFor publics an OOD class, the
string object. This class is aimed to address all the issues of the standard character type, as ISO Varying String approaches do, but it is also designed to provide a features-rich string object as you can find on other languages like Python. As a matter of facts, the auxiliary methods added to the
string object consitute a long list of new (for Fortraners) string-facilities, allowing you to handle strings effortless (cases-conversion, files-handling, encode/decode, numbers-casting, etc...), see the complete API. It is worth to note that StringiFor is a tentative to adopt an fully OOD thus all methods and operators are TBP defined: to use StringiFor you can import only the
string type, allowing a sane and robust names space handling. Only in the case you want the Fortran builtins to accept a
string instead of a standard character type, e.g. to use
index(astring, 'c') seamless with both a
type(string) :: astring and a
character(99) :: astring, you must use all the StringiFor public objects, including the overloaded interfaces of the Fortran builtins.
 Improved String-handling in Fortran, Clive Page, October 2015.
 aniso_varying_string, Ian Harvey, 2016.
Go to Top