Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Panel splitter classes #1220

Merged
merged 7 commits into from
Nov 17, 2023
Merged

Feature: Panel splitter classes #1220

merged 7 commits into from
Nov 17, 2023

Conversation

emhbrine
Copy link
Contributor

@emhbrine emhbrine commented Nov 9, 2023

The original PanelTimeSeriesSplit class is refactored into three classes:

  • ForwardPanelSplit

    • Maintains the functionality of PanelTimeSeriesSplit with n_split_method="expanding"
  • KFoldPanelSplit

    • Maintains the functionality of PanelTimeSeriesSplit with n_split_method="rolling"
  • IntervalPanelSplit

    • Maintains the functionality of PanelTimeSeriesSplit with train_intervals

The split() methods now yield the splits instead of a returning a list of all splits.

@emhbrine emhbrine force-pushed the feature/splitter branch 4 times, most recently from 3c90dad to 1ef087b Compare November 10, 2023 10:16
Copy link

codecov bot commented Nov 10, 2023

Codecov Report

Merging #1220 (4a6755b) into develop (841a957) will increase coverage by 0.37%.
Report is 1 commits behind head on develop.
The diff coverage is 59.88%.

❗ Current head 4a6755b differs from pull request most recent head 1af7db2. Consider uploading reports for the commit 1af7db2 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1220      +/-   ##
===========================================
+ Coverage    75.97%   76.35%   +0.37%     
===========================================
  Files           60       60              
  Lines         5345     5349       +4     
===========================================
+ Hits          4061     4084      +23     
+ Misses        1284     1265      -19     
Files Coverage Δ
macrosynergy/learning/__init__.py 100.00% <100.00%> (ø)
macrosynergy/learning/metrics.py 19.48% <100.00%> (ø)
macrosynergy/learning/prediction_tools.py 15.92% <50.00%> (ø)
macrosynergy/learning/cv_tools.py 13.79% <33.33%> (ø)
macrosynergy/learning/panel_time_series_split.py 60.00% <60.00%> (ø)

Impacted file tree graph

@emhbrine emhbrine marked this pull request as ready for review November 10, 2023 12:58
@emhbrine emhbrine requested review from rushilg99 and a team as code owners November 10, 2023 12:58
self.n_splits = n_splits
self.n_folds = n_splits + 1

def split(
Copy link
Contributor Author

@emhbrine emhbrine Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add return type hints

@rushilg99
Copy link
Contributor

rushilg99 commented Nov 15, 2023

To be reviewed on my end now.

rushilg99
rushilg99 previously approved these changes Nov 17, 2023
sorted(Xy.index.get_level_values(0).unique())
)
real_dates = Xy.index.get_level_values(1).unique().sort_values()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add:
freq_est = pd.infer_freq(real_dates)
freq_offset = pd.tseries.frequencies.to_offset(freq_est)


xranges_train: List[
Tuple[pd.Timestamp, pd.Timedelta]
] = self._calculate_xranges(cs_train_dates, real_dates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add freq_offset as an argument to calculate_xranges

)

def _calculate_xranges(
self, cs_dates: pd.DatetimeIndex, real_dates: pd.DatetimeIndex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add argument
:param <pd.DateOffset> freq_offset: DateOffset object representing the frequency of the dates in the panel.

# A single contiguous range of dates.
if len(difference) == 0:
xranges.append(
(cs_dates.min(), cs_dates.max() - cs_dates.min() + pd.Timedelta(days=1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace pd.Timedelta(days=1) with freq_offset

difference = difference[(difference >= cs_dates.min())]

xranges.append(
(cs_dates.min(), cs_dates.max() - cs_dates.min() + pd.Timedelta(days=1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace pd.Timedelta(days=1) with freq_offset

@emhbrine emhbrine merged commit c5c3327 into develop Nov 17, 2023
4 checks passed
@emhbrine emhbrine deleted the feature/splitter branch November 17, 2023 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants