Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GLPK solver reading #187

Merged
merged 12 commits into from
Aug 18, 2023
Merged

Add GLPK solver reading #187

merged 12 commits into from
Aug 18, 2023

Conversation

trevorb1
Copy link
Member

Description

In this PR I add a the class ReadGlpk(...) which allows a user to process GLPK solution files. Associated unit tests have also been added. This PR addresses one of the comments in the JOSS review about being able to use only a single solver with otoole.

Issue Ticket Number

Closes #19

Documentation

A sample use case has been added to the examples page on the documentation site. Notably, the user must use the --wglp and --write flags when creating the *.lp and solution files

glpsol -m model.txt -d data.txt --wglp model.lp --write model.sol

Then when processing the solution file with otoole, the user must also pass in the new optional flag --glpk_model added to otoole.

otoole results glpk csv model.sol results config.yaml --glpk_model model.lp

As elaborated in issue #19, and the GLPK wiki, while GLPK can write out a human readable format file (using the --output flag), it is suggested to use the --wglp and --write flags for machine readable files. Both these files are needed to extract out all solution values.

Questions

  1. Does this process of adding in the optional --glpk_model flag to otoole make sense? I couldn't really come up with a better way to do this. I don't know if this is making it too confusing though?
  2. The new ReadGlpk() class inherits from the ReadResultsCBC() to use its functionality to convert between wide and long formatted dataframes. This means that now ReadCbc(), ReadGurobi(), and ReadGlpk() all inherit from the ReadResultsCBC() class. The name ReadResultsCBC() may be a little outdated now. Should we change it to something more general like ConvertReadResults() or something like that, since its main function is to convert wide data to long data?

@trevorb1 trevorb1 requested a review from willu47 August 12, 2023 02:10
@trevorb1
Copy link
Member Author

Also, I have no idea what is up with this typing issue! Does anyone have any thoughts, else I will look at it again next week :)

@willu47
Copy link
Member

willu47 commented Aug 15, 2023

Hi @trevorb1. The typing issue is raised by the static type checker mypy. You can use it as part of the pre-commit configuration bundled in the otoole package (perhaps you already are). The typing error raised is

src/otoole/results/results.py:453: error: Incompatible types in assignment (expression has type "float", target has type "str")
src/otoole/results/results.py:461: error: Incompatible return value type (got "Tuple[Dict[str, str], Any]", expected "Tuple[Dict[str, Union[str, float]], Any]")

Fix by adding a type hint inline to line 433 of src/otoole/results/results.py

        status: Dict[str, Union[str, float]] = {}

Copy link
Member

@willu47 willu47 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @trevorb1. Nice work. It would be handy to access the GLPK solution, using otoole, although it's probably of limited overall utility given that glpsol is only used for relatively small models. Still it's a useful exercise.

I think the current implementation is okay. Performance-wise, looping over the lines of the text file is not so efficient. It's worth using pandas' read_csv function as it's much faster than parsing each line in a loop. It is also worth looking at the CBC reading of linopy for inspiration.

@trevorb1
Copy link
Member Author

trevorb1 commented Aug 16, 2023

Thanks so much for the feedback, @willu47! I guess I always thought directly reading in a text/csv file was more efficient then using pandas (since it uses built in datatypes) at the expense of being less useful to work with the data. So it is good to know read_csv() doesn't necessarily compromise performance.

I updated the ReadGlpk logic to use read_csv() as you suggested! The reading of the model file was fairly straightforward. The reading of the solution file was a little more awkward, as the number of columns would change if I just parse on whitespace characters. This would cause pandas to throw an error. There is the on_bad_lines argument to pass into read_csv(), but this would just drop the lines causing errors altogether.

I landed on reading everything in all at once, then parsing out the header information from the solution information via slicing and splitting the strings. Alternatively, we could read in the csv in parts (similar to shown below) to have read_csv() do the parsing for us. But this seemed more cumbersome as then we will have to type check for file_path if it is a string or TextIO object (which isnt shown in the code below).

def read_solution(self, file_path: Union[str, TextIO]) -> Tuple[Dict[str, Union[str, float]], pd.DataFrame]:
    # get status information
    df_status = pd.read_csv(file_path, header=None, nrows=7, sep=":", index_col=0)
    status["name"] = df_status.loc["c Problem", 1].strip()
    status["status"] = df_status.loc["c Status", 1].strip()
    status["objective"] = float(df_status.loc["c Objective", 1].split()[2])
    file_path.seek(0) # would need to type check this 
    
    # get solution infromation
    df = pd.read_csv(
        file_path,
        header=None,
        skiprows=[x for x in range(0, 8)],
        sep=r"\s+",
        names=["ID", "NUM", "STATUS", "PRIM", "DUAL"],
    )

    return status, df

I also slightly changed the logic of ReadGlpk to require the model file upon instantiation. It seemed pointless to read in a GLPK solution file without the model file since it doesn't have any information about what the variables are.

Please let me know if you have any other thoughts! Else I think we should be good to merge :)

@willu47
Copy link
Member

willu47 commented Aug 17, 2023

Hi @trevorb1 - given the tweak I made to the otoole results cli and the addition of the read_results function in #180, it would be worth updating this branch to reflect those changes, and merging this after that PR.

@willu47
Copy link
Member

willu47 commented Aug 18, 2023

Regarding renaming of classes, how about changing ReadResultsCBC to ReadWideResults?

@trevorb1
Copy link
Member Author

trevorb1 commented Aug 18, 2023

Hi @willu47, thanks for the pointers! All functionality to close this issue has been implemented, including the integration of reading GLPK with the Python API update. So I am going to go ahead and merge this.

Possibly a new issue will be how the glpk_model argument is passed into the getter function for the read result strategy. This is needed as the model file is required to instantiate the ReadGlpk class. However, Im not sure if this clutters the function (see below).

If you think refactoring how this works is needed, please just let me know or create a new issue and assign me to it. Thanks!

def _get_read_result_strategy(
user_config, from_format, glpk_model=None
) -> Union[ReadResults, None]:
"""Get ``ReadResults`` for gurobi, cbc, cplex, and glpk formats
Arguments
---------
config : dict
User configuration describing parameters and sets
from_format : str
Available options are 'cbc', 'gurobi', 'cplex', and 'glpk'
glpk_model : str
Path to ``*.glp`` model file
Returns
-------
ReadStrategy or None
A ReadStrategy object. Returns None if from_format is not recognised
"""
if from_format == "cbc":
read_strategy: ReadResults = ReadCbc(user_config)
elif from_format == "gurobi":
read_strategy = ReadGurobi(user_config=user_config)
elif from_format == "cplex":
read_strategy = ReadCplex(user_config=user_config)
elif from_format == "glpk":
if not glpk_model:
raise OtooleError(resource="Read GLPK", message="Provide glpk model file")
read_strategy = ReadGlpk(user_config=user_config, glpk_model=glpk_model)
else:
return None
return read_strategy

@trevorb1 trevorb1 merged commit 5a194de into OSeMOSYS:master Aug 18, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Programmatically process GLPK solution file
2 participants