
[REVIEW]: MarSwitching.jl: A Julia package for Markov switching dynamic models #6441

Open · editorialbot opened this issue Mar 4, 2024 · 31 comments

Labels: Julia, Matlab, review, TeX, Track: 4 (SBCS) Social, Behavioral, and Cognitive Sciences

@editorialbot commented Mar 4, 2024

Submitting author: @m-dadej (Mateusz Dadej)
Repository: https://github.com/m-dadej/MarSwitching.jl
Branch with paper.md (empty if default branch): joss
Version: 0.2.2
Editor: @jbytecode
Reviewers: @y1my1, @mkitti
Archive: Pending

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388"><img src="https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388/status.svg)](https://joss.theoj.org/papers/f0b33a8a4b30b3d9f0184dec014eb388)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@y1my1 & @mkitti, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @jbytecode know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Checklists

📝 Checklist for @y1my1

📝 Checklist for @mkitti

@editorialbot

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1137/141000671 is OK
- 10.2307/1912559 is OK
- 10.1142/S0219024902001523 is OK
- 10.1016/j.ijforecast.2014.03.014 is OK
- 10.1016/j.ejrh.2023.101374 is OK
- 10.1016/j.csfx.2021.100059 is OK
- 10.1016/j.joi.2020.101081 is OK
- 10.32614/rj-2019-012 is OK
- 10.25080/majora-92bf1922-011 is OK
- 10.2139/ssrn.1714016 is OK
- 10.2307/1392086 is OK
- 10.1017/S0305004100030401 is OK
- 10.1016/0304-4076(94)90036-1 is OK

MISSING DOIs

- None

INVALID DOIs

- None

@editorialbot

Software report:

github.com/AlDanial/cloc v 1.90  T=0.04 s (1236.4 files/s, 180186.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
SVG                              9              0              0           2659
Julia                           11            405            322           1111
Markdown                         8            243              0            902
CSV                              2              0              0            669
YAML                             8              5              8            189
TeX                              2             17              0            178
CSS                              1             12              0             76
R                                2             21              9             44
TOML                             2              4              0             29
Python                           1              5              1             26
MATLAB                           1              7              8             24
JSON                             1              4              0             17
-------------------------------------------------------------------------------
SUM:                            48            723            348           5924
-------------------------------------------------------------------------------

Commit count by author:

   185	SquintRook
    38	Mateusz Dadej

@editorialbot

⚠️ An error happened when generating the pdf. Problem with affiliations for Mateusz Dadej, perhaps the affiliations index need quoting?.

@editorialbot

Paper file info:

📄 Wordcount for paper.md is 1300

✅ The paper includes a Statement of need section

@editorialbot

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@jbytecode

@editorialbot generate pdf

@editorialbot

⚠️ An error happened when generating the pdf. Problem with affiliations for Mateusz Dadej, perhaps the affiliations index need quoting?.

@jbytecode

@m-dadej - Could you please fix the paper metadata before the reviewing process starts? The error output in the previous message may help indicate where the problem is. Thank you in advance.

@m-dadej commented Mar 6, 2024

@editorialbot generate pdf

@editorialbot

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@jbytecode

@y1my1, @mkitti - Dear reviewers, you can start by creating your task lists. There are several tasks in that list; whenever you perform a task, you can check off the corresponding checkbox. Since the JOSS review process is interactive, you can always interact with the author, the other reviewers, and the editor during the process. You can open issues and pull requests in the target repo. Please mention the URL of this page there so we can keep track of what is going on outside of this review thread.

Please create your tasklist by typing

@editorialbot generate my checklist

Thank you in advance.

@jbytecode

@y1my1, @mkitti - After two weeks, may I kindly ask our reviewers to begin their reviews and update their status?

@y1my1 commented Mar 18, 2024

Review checklist for @y1my1

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the https://github.com/m-dadej/MarSwitching.jl?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@m-dadej) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data or research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@y1my1 commented Mar 23, 2024

Overall, this is a good, mature package that achieves what the author intends. I can tell the author has put a lot of effort into programming the software and writing the documentation. I have some minor issues with the following:

Functionality

  • Functionality: Have the functional claims of the software been confirmed?

I was trying to verify whether the package can replicate the results of a basic regression model I found here (data can be found here). However, the results I got from this package seem to be quite different from what I get from statsmodels (Python) and Stata (statsmodels and Stata give almost the same results). See the attached graph below.
[screenshot: estimation results from MarSwitching.jl compared with statsmodels/Stata]

  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

I tried to compare the performance of the corresponding methods in the package and statsmodels (Python). I generated simulated data of 10,000 rows, following this example. The results don't seem to be that different, at least on my MacBook (see below). However, I may not be doing it the right way; it would be good if the author could document how he did the performance test.
[screenshot: benchmark timings on the reviewer's MacBook]
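For reference, a minimal timing sketch along these lines could look as follows (a rough stand-in only: BenchmarkTools is assumed to be installed, the synthetic two-regime series is not the data-generating process of the statsmodels example, and MSModel(y, 2) relies on the package defaults):

using MarSwitching, BenchmarkTools, Random

# Rough stand-in data: 10,000 observations drawn from two regimes
# (not the exact setup of the statsmodels example).
Random.seed!(1)
n = 10_000
state = rand(1:2, n)
y = [s == 1 ? 0.5 + randn() : -0.5 + 2 * randn() for s in state]

# Time a two-regime fit with the package defaults.
@btime MSModel($y, 2)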

Documentation

  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

There are no community guidelines, but this should be an easy fix.

@mkitti commented Mar 23, 2024

Review checklist for @mkitti

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the https://github.com/m-dadej/MarSwitching.jl?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@m-dadej) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data or research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@m-dadej commented Mar 29, 2024

Thank you for your comments, y1my1, and sorry for the delayed response; I was busy with my PhD.

Regarding the functionality comment: the way you defined the model with the MSModel function won't give you the same model as in statsmodels. Here is the correct command:

model = MSModel(df[2:end, 2], 2, exog_switching_vars = df[begin:end-1, 2], switching_var = false)

exog_vars is for non-switching exogenous variables, and, as the equation in the example suggests, the variance of the model is not regime-dependent (hence I specified switching_var = false). The default is a regime-dependent variance. I am very much open to suggestions on how to make the function more intuitive to use :)
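To make the two variants explicit, writing y = df[2:end, 2] and x = df[begin:end-1, 2] from the command above:

y = df[2:end, 2]        # dependent variable
x = df[begin:end-1, 2]  # its lagged values, used as a switching regressor

# Constant (non-switching) variance, matching the statsmodels example:
model_const_var = MSModel(y, 2, exog_switching_vars = x, switching_var = false)

# Package default: regime-dependent variance.
model_switch_var = MSModel(y, 2, exog_switching_vars = x)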

That being said, the current version of the package still gives some estimates that are a bit off compared with statsmodels:

[screenshot: updated comparison of estimates with statsmodels]

One of the intercepts is now quite different. I think I compared this some time ago but disregarded the issue, since the rest of the examples matched my package. Also, the Matlab package MS_Regress gave different estimates as well.
In any case, I checked my code and the issue is solved in the most recent commit (m-dadej/MarSwitching.jl@6fa582d). (tl;dr: before the commit, I was adding a small number to the state probabilities to work around an underflow error that had been occurring some time ago.)

If you install the GitHub version of the package, you can now replicate the results from statsmodels.

Regarding the performance comparison: the benchmark code is in the benchmark folder. The README example can give different results than the benchmark (in terms of relative performance) because the model is different (less complex). In the benchmark code I estimate a model with 3 regimes, one non-switching and one switching exogenous variable, as sketched below.
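In MSModel terms, that specification is roughly the following (y, x_switching and x_nonswitching are placeholder names for the simulated series; see the benchmark folder for the exact code):

# 3 regimes, one switching and one non-switching exogenous variable.
model = MSModel(y, 3,
                exog_switching_vars = x_switching,
                exog_vars = x_nonswitching)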

That being said, I compared the README example, and there the relative performance gap with statsmodels is smaller: MarSwitching is ~2x faster. Still, I am unable to replicate your results, even though the code in your screenshot seems fine.

About community guidelines, it's a good idea. I will write them soon.

@jbytecode

@m-dadej - Do you think the first sentence in the article is a bit too assertive? I believe there are other packages in Julia that do similar or identical tasks. Perhaps it would be better if you softened the first sentence and referred to packages that perform the same or similar tasks.

@m-dadej commented Apr 2, 2024

I have not seen any package that does an identical task. There is a package for modeling hidden Markov models (https://github.com/gdalle/HiddenMarkovModels.jl) which may allow similar estimation. However, as you can see in its examples, it requires additional programming from the user in order to estimate a rather simple Markov switching model. Do you have any other package in mind that I missed?

@jbytecode

@m-dadej - Yes, you can refer to that package, emphasizing that it can address the same types of problems with additional effort, while highlighting that your package handles these tasks efficiently. Phrasing it in that smoother manner would be better than the current, rather sharp wording. Thank you in advance.

@jbytecode

@y1my1, @mkitti - May I kindly ask you to update your review status? Thank you in advance.

@mkitti commented May 5, 2024

@editorialbot generate pdf

@mkitti commented May 5, 2024

For the examples, an environment including a Project.toml and Manifest.toml should be included with the repository. This would allow users to reproduce the examples with the exact versions of the packages used.

Some of the examples use data included in the package as follows:

df = CSV.read("my_assets/philips.csv", DataFrame, missingstring = "NA")

df = CSV.read("my_assets/df_spx.csv", DataFrame)

I might make those paths relative to the installed package so that no additional work is needed for the user to try the examples. They would otherwise have to clone the repository, which is not necessary since the files are already installed.

# Resolve the CSV path relative to the installed package rather than the
# current working directory; pathof points at the package's src/MarSwitching.jl.
philips_csv_path = joinpath(
    dirname(pathof(MarSwitching)),
    "../docs/src/man/examples/my_assets/philips.csv"
)
df = CSV.read(philips_csv_path, DataFrame, missingstring = "NA")

# Same idea for the df_spx.csv example data.
df_spx_csv_path = joinpath(
    dirname(pathof(MarSwitching)),
    "../docs/src/man/examples/my_assets/df_spx.csv"
)
df = CSV.read(df_spx_csv_path, DataFrame)

@mkitti commented May 5, 2024

A reference to HiddenMarkovModels.jl and a note on its limitations would be greatly appreciated. Users may be trying to decide between the two packages, and a comparison with this package would be very useful.

@mkitti commented May 5, 2024

I see Community Guidelines in the README.md of the GitHub repository, but not in the documentation itself.

https://github.com/m-dadej/MarSwitching.jl/blob/main/README.md#contributing

Please include this information in https://m-dadej.github.io/MarSwitching.jl/

@mkitti commented May 5, 2024

Overall, MarSwitching.jl is an important contribution to the field, allowing Markov switching dynamic models to be simulated and analyzed in the Julia Programming Language. Comparisons are provided to packages in other languages. Comprehensive examples are provided. The documentation and paper are well written and organized.

Additional comparisons and contrasts to existing packages in the Julia ecosystem would be appreciated, as others have noted in this review. I also suggested a few minor changes to make it easier to run the examples by referencing data files relative to the installed package.

If the concerns raised above in my own review and those of the other reviewer are addressed, I would recommend the paper for publication.

I apologize for the delay in reviewing the submission.

@mkitti commented May 5, 2024

Minor correction

> MarSwitching.jl is a package for estimating Markov switching dynamic models (also called regime switching) for Julia. This is a class of models with time-varying coefficients depending on an unobservable state/regime that follows [a] Markov process. The package provides tools for estimation, inference and simulation of the models.

The "a" should be inserted.

@y1my1 commented May 5, 2024

> (quoting @m-dadej's reply of Mar 29 in full; see above)

The performance difference we saw may be due to the different environments we are using. I updated Julia to the newest version (1.10.3) and now see MarSwitching run 3 times faster than statsmodels. I would recommend adding a note on what environment was used. In addition, I have not seen "," used as equivalent to "." anywhere else, e.g. "6,7", which is hard to read, at least to me.

[screenshots: benchmark timings for MarSwitching.jl and statsmodels after the Julia update]
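A lightweight way to record the environment alongside the benchmark (generic Julia, not specific to this package) would be:

using InteractiveUtils, Pkg

versioninfo()               # Julia version, OS, CPU and thread count
Pkg.status("MarSwitching")  # installed MarSwitching.jl version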

@jbytecode

@m-dadej - Please consider the changes and additions suggested by our reviewers, and then ping us again. Thank you in advance.

@m-dadej commented May 8, 2024

Hey @jbytecode, I made changes according to the reviewers' suggestions.

@mkitti Project.toml and Manifest.toml are now in the examples folder, and the scripts now use the path-generating function you suggested. The contribution guidelines are also in the docs now.

About the HiddenMarkovModels.jl comparison: the last paragraph of the Statement of need now reads:

> Despite the popularity of the method, MarSwitching.jl is, at the moment, the only package that allows for effortless estimation of Markov switching models with Julia programming language. At the same time, it is implemented purely in this language. For more general modeling with hidden Markov models, Julia users may find HiddenMarkovModels.jl [Dalle2024] package useful as well. HiddenMarkovModels.jl offers more generic approach to programming hidden Markov models, albeit requiring user-side development of certain estimation algorithms for Markov switching models, as well as model inference functions.

I did not elaborate further as the paper is already pretty long. Of course I am open to any suggestions.

@y1my1 I added the information regarding the environment. The paragraph below the benchmark table now reads:

> Software versions: MarSwitching.jl v0.2.2, Statsmodels v0.14.1, MSwM v1.5, MS_Regress v1.11. The programming languages versions were: Julia v1.10.1, Python v3.12.2, R v4.2.1 and MATLAB vR2024a. Calculations were run on: Windows 11 x64 Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz, 2592 Mhz, 6 Core(s), 12 Logical Processor(s).

The commas are removed as well.

@jbytecode

@y1my1, @mkitti - Dear reviewers, how are things going on? Is there anything to update? Could you please update your status? Thank you in advance.
