Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prediction engine for time series. #1618

Closed
wants to merge 30 commits into from

Conversation

codemzs
Copy link
Member

@codemzs codemzs commented Nov 14, 2018

Prediction engine for time series that updates the time series transform state at the time of prediction. Next iteration will add more tests.

@codemzs codemzs requested a review from ganik November 14, 2018 17:58

if (rows.Count == 0 && outputRow is IStatefulRow)
rows.Add((IStatefulRow)outputRow);

Copy link
Contributor

@Zruty0 Zruty0 Nov 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels like an unroll of 1 level of recursion. Can't GetStatefulRows also handle case where mapper is not a CompositeRowToRowMapper? #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still active


In reply to: 233642937 [](ancestors = 233642937)

Zruty0
Zruty0 previously requested changes Nov 14, 2018
Copy link
Contributor

@Zruty0 Zruty0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

if (prediction == null)
prediction = new TDst();

//Update State.
Copy link
Contributor

@Zruty0 Zruty0 Nov 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

//Up [](start = 12, length = 4)

// Update state. and // Predict. #Resolved

/// to get the dependencies on input columns and the getters for the output columns, given an active set of output columns.
/// </summary>

public sealed class TimeSeriesRowToRowMapperTransform : RowToRowTransformBase, IRowToRowMapper,
Copy link
Contributor

@Zruty0 Zruty0 Nov 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TimeSeriesRowToRowMapperTransform [](start = 24, length = 33)

Why did you need to clone it, and why is it public? #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll refactor this out and make it internal


In reply to: 235554740 [](ancestors = 235554740)

/// <summary>
/// Run prediction pipeline on one example.
/// </summary>
/// <param name="example">The example to run on.</param>
/// <param name="prediction">The object to store the prediction in. If it's <c>null</c>, a new one will be created, otherwise the old one
/// is reused.</param>
public void Predict(TSrc example, ref TDst prediction)
public virtual void Predict(TSrc example, ref TDst prediction)
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtual [](start = 15, length = 7)

If 2 existing implementations are dissimilar, what's the value in having a virtual method? Make it abstract #Resolved

@@ -7,30 +7,40 @@

namespace Microsoft.ML.Runtime.Data
{
public sealed class PredictionFunction<TSrc, TDst> : PredictionFunctionBase<TSrc, TDst>
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PredictionFunction [](start = 24, length = 18)

I honestly don't understand the purpose of PredictionFunction. Why introduce a hierarchy of useless classes? I would much rather rename PredictionEngine to PredictionFunction and be done with it.

It doesn't have to be done in this PR, but could you just use PredictionEngine only, to simplify that future change? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm with you 100% on this. I have the same thought when I was making this change.


In reply to: 236007234 [](ancestors = 236007234)

}

internal virtual void CreatePredictionEngine(IHostEnvironment env, ITransformer transformer, out PredictionEngineBase<TSrc, TDst> engine) =>
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreatePredictionEngine [](start = 30, length = 22)

use return value instead of out parameter #Resolved

/// <summary>
/// A chain of transformers (possibly empty) that end with a <typeparamref name="TLastTransformer"/>.
/// For an empty chain, <typeparamref name="TLastTransformer"/> is always <see cref="ITransformer"/>.
/// </summary>
public sealed class TransformerChain<TLastTransformer> : ITransformer, ICanSaveModel, IEnumerable<ITransformer>
public sealed class TransformerChain<TLastTransformer> : ITransformer, ICanSaveModel, IEnumerable<ITransformer>, ITransformerAccessor
where TLastTransformer : class, ITransformer
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing. You now can access the inner transformers in 3 ways: via foreach (as an IEnumerable), via the internal field Transformers and via ITransformerAccessor.Transformers. Let's reduce it to one. Maybe make in an IEnumerable of pairs, if you need access to scopes #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or you could make Transformers and Scopes public and IReadOnlyLists


In reply to: 236007659 [](ancestors = 236007659)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created this interface so that I can access transformers and scope from an ITransfomer reference because casting ITransfomer to TransformerChain is messy.


In reply to: 236007746 [](ancestors = 236007746,236007659)

@@ -306,10 +307,10 @@ public abstract class AnomalyDetectionStateBase : StateBase
protected SequentialAnomalyDetectionTransformBase<TInput, TState> Parent;

// A windowed buffer to cache the update values to the martingale score in the log scale.
private FixedSizeQueue<Double> _logMartingaleUpdateBuffer;
private FixedSizeQueue<Double> LogMartingaleUpdateBuffer { get; set; }
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogMartingaleUpdateBuffer [](start = 43, length = 25)

why this change? #Resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, could you add this assembly to be tracked by RoboTom?


In reply to: 236008010 [](ancestors = 236008010)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this value for saving the state in the case of IID. I will add TS binary to be tracked by RoboTom.


In reply to: 236008263 [](ancestors = 236008263,236008010)

@Zruty0
Copy link
Contributor

Zruty0 commented Nov 23, 2018

using Microsoft.ML.Core.Data;

summary comment #Resolved


Refers to: src/Microsoft.ML.TimeSeries/PredictionEngine.cs:1 in 7bf4c39. [](commit_id = 7bf4c39, deletion_comment = False)

public bool IsRowToRowMapper => false;
public bool IsRowToRowMapper => true;

public TState StateRef{ get; set; }
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public TState StateRef{ get; set; } [](start = 8, length = 35)

I don't see this as a good idea at all.
The whole cloning mechanism could be implemented completely internally, without making anything public, or mutable. Just don't use MemberwiseClone, and write it yourself. This also has the benefit of self-documenting what's being cloned and how. #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.


In reply to: 236008749 [](ancestors = 236008749)

@Zruty0
Copy link
Contributor

Zruty0 commented Nov 23, 2018

you'll need to revert your changes to libmf. #Resolved

{
private ValuePinger[][] _pingers;
private int _rowPosition;
public ITransformer CheckPoint => Transformer;
Copy link
Contributor

@Zruty0 Zruty0 Nov 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transformer [](start = 42, length = 11)

This would mean that if you checkpoint something, it's a mutable object. As far as I recall, we agreed to keep mutable transformers completely inside the prediction engine. This means that you need to CloneTransformers here.

Also, make it a method: ITransformer GetCurrentTransformer() or something like this. #Pending

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused but please help me understand your concern here. Isn't that what I'm doing below? I have changed the API for checkpointing where the user passes a modelpath and prediction engine writes the model with updated state there.


In reply to: 236009638 [](ancestors = 236009638)

@Zruty0
Copy link
Contributor

Zruty0 commented Nov 23, 2018

I finished reviewing the iteration 24 on 11/23. This code still has minor cleanup needed, but I'm more concerned of whether the batch prediction is affected (for some reason you had to skip the time series tests). #Resolved

@Zruty0 Zruty0 dismissed their stale review November 23, 2018 19:26

revoking review

@codemzs
Copy link
Member Author

codemzs commented Nov 26, 2018

Thanks, Pete. Batch Prediction is not affected, I have enabled batch prediction tests and they are passing.


In reply to: 441307172 [](ancestors = 441307172)

@codemzs codemzs closed this Nov 27, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants