Improved data module #127
Merged
used script git-split.sh

    #!/bin/sh
    # used https://stackoverflow.com/questions/3887736/keep-git-history-when-splitting-a-file
    # POSIX test syntax, since the shebang is /bin/sh
    if [ "$#" -ne 2 ]; then
        echo "Usage: git-split.sh original copy"
        exit 1
    fi
    git mv "$1" "$2"
    git commit -n -m "Split history $1 to $2"
    REV=$(git rev-parse HEAD)
    git reset --hard HEAD^
    git mv "$1" temp
    git commit -n -m "Split history $1 to $2"
    git merge "$REV"
    git commit -a -n -m "Split history $1 to $2"
    git mv temp "$1"
    git commit -n -m "Split history $1 to $2"
the test TestData.test_yfinance_download in cvxportfolio/tests/test_data.py became fragile, need to understand why
the names whose historical data gets trimmed are HUBB, JCI, and NVR; the trimming seems reasonable
enzbus added a commit that referenced this pull request on Feb 17, 2024
A few changes to the SymbolData and YahooFinance classes following PR #127 (improved data module), and many new test cases. There was an incident yesterday with one of the larger example strategies: the online update of a symbol failed to forward-fill (ffill) correctly. This should now be fixed. Cleaning and filling on update of YahooFinance have been improved in various ways and are tested much more thoroughly. Note that this is specific to *updating* already downloaded data, not to downloading from scratch (that was already tested with #127).
enzbus added a commit that referenced this pull request on Feb 21, 2024
This minor release contains some new features, as expected under semantic versioning. The two major pull requests merged are #126 and #127.

We significantly enhanced the built-in forecaster classes, like HistoricalMeanReturn and HistoricalCovariance, which are used by Cvxportfolio objects (ReturnsForecast and FullCovariance, for example) as default forecasters. These now take new “rolling” and “half_life” parameters. The first specifies the length of the historical period used for estimation (by default, all history available at each point in time); the second specifies the half-life of the exponential smoothing applied (by default, no smoothing). These apply to all forecasters. Internally, these more complex estimations are done with minimal extra computation (by updating, at each point in time, the estimation from the previous step). These parameters can be specified by instantiating the forecasters explicitly before passing them to each Cvxportfolio object (as is explained in the documentation).

Thanks to the improved forecasters, we enhanced the trading cost classes TransactionCost and HoldingCost, clearing up some minor discrepancies with their original (2016) implementation. For TransactionCost, most notably, it is now possible to provide all forecasts without relying on internal forecasters. This will make it easier to translate the code of the original examples to the stable API. The changes are backward compatible with the documented interfaces of the previous versions. One interesting change we introduced on the cost objects is that the same code paths are now used to compute the cost values in simulation and optimization, further improving the safety and auditability of the library.

Finally, we re-wrote the data cleaning and data quality check code applied by the YahooFinance default data interface to US and international stock market data.
This has been partially factored out into a new base class for open-low-high-close-volume data, which we expect to use for other (future) data interfaces. The cleaning is done by first removing impossible observations for stock market data (non-positive prices, …), then removing unlikely data (prices that imply 100x returns, …) with threshold-based testing (where all thresholds can be modified by the user), and finally by filling missing values, being careful to avoid look-ahead biases (open prices that are missing get filled with close prices from the day before, …). A new example, “data_cleaning.py”, can be used to see exactly what is being done on each given name. Logs are also highly informative.

All this is thoroughly tested both by the unit tests and by the example strategies that are run on every trading day a few minutes after the open time of both the US and now also international stock markets: we added a daily strategy that is run on the FTSE100 universe of the London stock market. Other minor edits and bug fixes are also present.
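The three cleaning stages described in these release notes (remove impossible values, remove unlikely values by threshold, fill without look-ahead) can be illustrated with a small standalone sketch. This is not the Cvxportfolio implementation: the function name `clean_and_fill`, the `max_ratio` threshold, and the plain-list data layout are all hypothetical, chosen only to keep the example self-contained.

```python
def clean_and_fill(opens, closes, max_ratio=100.0):
    """Illustrative three-stage cleaning of daily open and close prices.

    Hypothetical helper, not part of Cvxportfolio. `opens` and `closes` are
    lists with one entry per trading day; a missing value is None.
    """
    opens, closes = list(opens), list(closes)

    # Stage 1: remove impossible observations (non-positive prices).
    for series in (opens, closes):
        for i, price in enumerate(series):
            if price is not None and price <= 0:
                series[i] = None

    # Stage 2: remove unlikely observations. Here, a close that is more than
    # `max_ratio` times the last valid close (user-adjustable threshold).
    last_valid = None
    for i, price in enumerate(closes):
        if price is None:
            continue
        if last_valid is not None and price / last_valid > max_ratio:
            closes[i] = None
        else:
            last_valid = price

    # Stage 3: fill a missing open with the close of the day before.
    # Only past data is used, so no look-ahead bias is introduced.
    for i in range(1, len(opens)):
        if opens[i] is None and closes[i - 1] is not None:
            opens[i] = closes[i - 1]

    return opens, closes


cleaned_opens, cleaned_closes = clean_and_fill(
    opens=[10.0, None, -1.0, 11.0],
    closes=[10.5, 10.8, 5000.0, 11.2],
)
```

In this toy input the negative open and the implausible close are both discarded, and the two missing opens are filled from the previous day's close. The discarded close is left missing here because the notes only spell out the fill rule for open prices.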
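The release notes mention that the “rolling” and “half_life” estimates are obtained by updating the previous step's estimate rather than recomputing from scratch. A minimal sketch of that idea for a mean is shown below. The class `IncrementalMean` is hypothetical and is not the library's code; it also assumes both parameters are expressed as a number of observations, which is a simplification made for this example.

```python
import math
from collections import deque


class IncrementalMean:
    """Exponentially weighted, optionally windowed mean, updated in O(1).

    Hypothetical illustration, not part of Cvxportfolio. `half_life` and
    `rolling` are counted in observations; math.inf disables each one.
    """

    def __init__(self, half_life=math.inf, rolling=math.inf):
        # Weight multiplier applied to all past observations at each step.
        self.decay = 1.0 if math.isinf(half_life) else 0.5 ** (1.0 / half_life)
        self.rolling = rolling
        self.window = deque()  # observations currently inside the window
        self.num = 0.0         # weighted sum of observations
        self.den = 0.0         # sum of weights

    def update(self, x):
        # Age the previous estimate by one step and add the new observation.
        self.num = self.decay * self.num + x
        self.den = self.decay * self.den + 1.0
        self.window.append(x)
        if len(self.window) > self.rolling:
            # Drop the observation that left the window, with the weight
            # it carries after being aged `rolling` times.
            old = self.window.popleft()
            weight = self.decay ** len(self.window)
            self.num -= weight * old
            self.den -= weight
        return self.num / self.den


estimator = IncrementalMean(half_life=1, rolling=3)
estimates = [estimator.update(x) for x in [1.0, 3.0, 2.0, 4.0]]
```

With the default arguments the class reduces to the plain mean of all history, matching the defaults described in the notes (all history, no smoothing).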
Better pull request (replaces #125)