
[Discussion] Design API of Markov Chains #16895

Open
czgdp1807 opened this issue May 26, 2019 · 78 comments

@czgdp1807 (Member) commented May 26, 2019

References

Documentation
[1.1] https://reference.wolfram.com/language/guide/RandomProcesses.html
[2.1] https://reference.wolfram.com/language/ref/DiscreteMarkovProcess.html
[3.1] https://reference.wolfram.com/language/ref/ContinuousMarkovProcess.html

Examples
[4.1] https://www.utdallas.edu/~mbaron/3341/Practice8.pdf
[5.1] https://en.wikipedia.org/wiki/Examples_of_Markov_chains

Associated PRs
[1.2] #16852
[2.2] #16866

Proposed Class Structure

As can be seen in [1.1], [2.1], and [3.1], the class structure used by the Wolfram Language & System is as follows:

  1. RandomProcess - the superclass holding properties common to the various types of random processes, such as RandomFunction, SliceDistribution, etc.
  2. MarkovProcess - the derived class, containing the concrete implementations of the above properties.

Since in Phase 1 I will be working with Markov chains, I propose the following API, in line with [2.1], [3.1], and [4.1]:

Discrete

DiscreteMarkovChain('<name>', i0, m) # i0 is the initial state and m is the transition matrix
DiscreteMarkovChain('<name>', p0, m) # p0 is the initial state probability vector and m is again the transition matrix
DiscreteMarkovChain(..., g) # graph g from which transition matrix will be computed. g can be in terms of dictionaries.

Continuous

ContinuousMarkovChain('<name>', i0, m) # i0 is the initial state and m is the transition matrix
ContinuousMarkovChain('<name>', p0, m) # p0 is the initial state probability vector and m is again the transition matrix
ContinuousMarkovChain(..., g) # graph g from which transition matrix will be computed. g can be in terms of dictionaries.
ContinuousMarkovChain(..., m, mu) # m is the transition matrix and mu is the set of transition states

Note - Both of the above APIs will support finite-state Markov chains.
In fact, in most of the literature I have come across, finite-state Markov chains are the most common.
In addition, the return types of all of the above queries will be vectors, as in [5.1].

Other Comments

The Markov chain class will have concrete implementations of the properties declared in the RandomProcess class.
I prefer the name MarkovChain; however, going with MarkovProcess is also feasible and valid.

There are many problems in [4.1] and [5.1] which require at least a transition matrix for their computations.

Checkpoints

[1] #16895 (comment)
[2] #16895 (comment) (waiting for final comments)

@czgdp1807 (Member Author) commented May 26, 2019

The simplest way to design a Markov chain is to accept a transition matrix and an initial state distribution and return the final distribution.
For example, take the simple weather model problem. The transition matrix is Matrix([[0.9, 0.1], [0.5, 0.5]]). Say the initial weather is sunny, so the initial state distribution is [1, 0] (1 for sunny, 0 for rainy). What will the weather be on, say, day 2? Here is the simplest API:

>>> T = Matrix([[0.9, 0.1], [0.5, 0.5]])
>>> x = [1, 0]
>>> M = MarkovChain('M', T, x)
>>> P(M[1])
[0.9, 0.1]
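For reference, the result can be reproduced with plain matrix algebra using the existing sympy Matrix class (only MarkovChain above is the proposed, not-yet-existing API): the distribution after n steps is x*T**n.

```python
from sympy import Matrix, Rational

# Weather model: rows are the current state (sunny, rainy), columns the
# next state; entries are one-step transition probabilities.
T = Matrix([[Rational(9, 10), Rational(1, 10)],
            [Rational(1, 2), Rational(1, 2)]])
x = Matrix([[1, 0]])  # start sunny with probability 1

# The state distribution after n steps is x*T**n.
day1 = x * T       # distribution of M[1]: entries 9/10 and 1/10
day2 = x * T**2    # distribution of M[2]
print(day1, day2)
```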
@czgdp1807 changed the title from "[Discussion][GSoC] Design API of Markov Chains" to "[Discussion] Design API of Markov Chains" on May 27, 2019

@Upabjojr (Contributor) commented May 27, 2019

>>> P(M[1])
[0.9, 0.1]

This looks weird; probability is supposed to take a condition and evaluate it. M[1] is a random variable on the state space of the Markov process. I would rather have

>>> P(Equals(M[1], 0))
0.1
>>> P(Equals(M[1], 1))
0.9
@Upabjojr (Contributor) commented May 27, 2019

At first glance, it looks like Wolfram is assuming that the state space is [1, 2, 3, 4, ...] (in Python, it would start from zero).

Should we support arbitrary state spaces?

@czgdp1807 (Member Author) commented May 27, 2019

About API

>>> P(Equals(M[1], 0))
0.1
>>> P(Equals(M[1], 1))
0.9

The above API is easier to implement. I would also go with this rather than the one returning vectors or lists.

About State Spaces
I don't support arbitrary state spaces, at least for DiscreteMarkovChain, because we can map each state of an arbitrary state space to one in [0, 1, 2, 3, ...]. AFAIK, in a DiscreteMarkovChain, the state space is supposed to be a countable set. What do you say?
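For illustration, the mapping idea can be sketched in a few lines (index_map is a hypothetical helper, not existing SymPy API):

```python
def index_map(states):
    """Fix an ordering of an arbitrary finite state space and map each
    state to a row/column index 0, 1, 2, ... of the transition matrix."""
    return {state: i for i, state in enumerate(states)}

to_index = index_map(['sunny', 'rainy', 'foggy'])
print(to_index)  # {'sunny': 0, 'rainy': 1, 'foggy': 2}

# A query about named states then reduces to an integer-indexed lookup,
# e.g. P(X[n+1] = 'foggy' | X[n] = 'sunny') -> T[to_index['sunny'], to_index['foggy']]
```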

@czgdp1807 (Member Author) commented May 27, 2019

About Class Structure
I have two plans in mind:

  1. Wolfram class structure - Honestly, it is a bit complex, and implementing it directly from the start would make the code harder to maintain. However, one point in its favour is that it has already been tried and tested by Wolfram, so adopting it may benefit experienced developers who use both Wolfram and SymPy.
  2. Simple class structure - This would be very simple: one class each for DiscreteMarkovChain and ContinuousMarkovChain, with various properties and methods and a simple API as suggested in this comment, along with the modifications in the comment above. The plus point of a single class each for DiscreteMarkovChain and ContinuousMarkovChain is easier maintainability and avoiding the complications caused by inheriting from the already complex classes of the stats module.

From my point of view, we should go with the simple class structure, because it is easy to go from simple to complex but very hard to come back from complex to simple, as is happening with the stats module. Refactoring a simple structure is easier than a complex one, and it will let us focus on corner cases (undefined transition matrix, state spaces, etc.) rather than on managing a large code base with a lot of inheritance. What do you say @Upabjojr ?
I will wait for comments from the community until the middle of the week; otherwise I will assume from your previous suggestions and this comment that you also support my view, and I will begin implementing DiscreteMarkovChain followed by ContinuousMarkovChain. Thanks for the comments. :)

@Upabjojr (Contributor) commented May 27, 2019

AFAIK, in DiscreteMarkovChain, the state spaces are supposed to be countable sets. What do you say?

By discrete what do you mean? Discrete time or discrete state space?

@Upabjojr (Contributor) commented May 27, 2019

@czgdp1807 you are still reasoning in an abstract way. Just get a collection of common problems involving stochastic processes, and try to write draft usage cases for each one.

Class design is postponed.

@czgdp1807 (Member Author) commented May 27, 2019

The possible values of Xi form a countable set S called the state space of the chain.

From wikipedia page of Discrete-time Markov Chain.

@Upabjojr (Contributor) commented May 27, 2019

I recommend drawing up examples with E, P, var, covariance, and especially conditional probability. Try to match these with some real problems and write some usage examples. Let's proceed this way. Class design can be done after the API design.

@czgdp1807 (Member Author) commented May 27, 2019

@Upabjojr I will take a Markov chain problem set and give fake example sessions (something like in this comment) solving those problems, by tomorrow.
PS - I apologise for repeating the same thing again and again. :( I thought that I had to do the problems on my own and come up with a class structure.

@Upabjojr (Contributor) commented May 27, 2019

The possible values of Xi form a countable set S called the state space of the chain.

The state space may still be a countable subset of real numbers. I would recommend arbitrary set support. The transition matrix is going to require some indexing of this set. This can be checked after the API design.

@Upabjojr (Contributor) commented May 27, 2019

I thought that I have to do the problems on my own and come up with a class structure.

No, write them here. They could also form the initial draft for the final documentation.

@czgdp1807 (Member Author) commented May 27, 2019

This can be checked after the API design.

Sure. I will first work on API design now. We will come back to this later.

@czgdp1807 (Member Author) commented May 28, 2019

ping @Upabjojr
I have written programmatic solutions for some problems related to Markov chains. Please read the exact problem statements and mathematical solutions via the links provided at the beginning of each section.

Module 4, Stochastic Processes, NPTEL

Q1

>>> state_space = {0, 1, 2} # given state space
>>> p0 = [1/4, 0, 1/2] # given initial probability vector
>>> T = Matrix([[1/4, 3/4, 0],
... [1/3, 1/3, 1/3], 
... [0, 1/4, 3/4]]) # given one-step transition matrix 
>>> X = DiscreteMarkovChain('X', state_space, T, p0) # creating a fake object
>>> P((X[0] = 0, X[1] = 1, X[2] = 1)) # currently probability doesn't support conditions to be tuples; however, this problem requires this to be done
0.3993
>>> P((X[1] = 1, X[2] = 1), X[0] = 0)
1/36
>>> P(X[4] = 1, X[2] = 2)
13/48
>>> P(X[7] = 0, X[5] = 0)
5/16
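Independently of the proposed DiscreteMarkovChain API, the two conditional answers above can be cross-checked with the existing sympy.Matrix class, since k-step transition probabilities are just entries of T**k:

```python
from sympy import Matrix, Rational

# One-step transition matrix from Q1.
T = Matrix([[Rational(1, 4), Rational(3, 4), 0],
            [Rational(1, 3), Rational(1, 3), Rational(1, 3)],
            [0, Rational(1, 4), Rational(3, 4)]])

# P(X[4] = 1 | X[2] = 2) is the (2, 1) entry of the 2-step matrix T**2:
print((T**2)[2, 1])  # 13/48

# P(X[7] = 0 | X[5] = 0) is the (0, 0) entry of T**2:
print((T**2)[0, 0])  # 5/16
```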

Problem Set 3, 18.445, MIT

Exercise 11

>>> state_space = {0, 1, 2} # given state space
>>> initial_state = 0 # given initial state
>>> T = Matrix([[0.7, 0.2, 0.1],
... [0.3, 0.5, 0.2],
... [0, 0, 1]]) # given transition matrix
>>> X = DiscreteMarkovChain('X', state_space, T, initial_state) # fake Markov chain object
>>> t = X.first_hit_time(state = 2) # should return an unevaluated object
>>> t
FirstHitTime(state = 2)
>>> P(X[3] = 0, t > 3)
457/687
>>> P(X[3] = 0, t < 3) # not a part of the problem but showing it for clarity
0

Exercise 15

>>> state_space = {0, 1, 2} # given state space
>>> initial_state = 0 # given initial state
>>> T = Matrix([[0.7, 0.3, 0],
... [0, 0.6, 0.4],
... [0, 0, 1]]) # given transition matrix
>>> X = DiscreteMarkovChain('X', state_space, T, initial_state) # fake Markov chain object
>>> t = X.first_hit_time(state = 2)
>>> t
FirstHitTime(state = 2)
>>> s = symbols('s', positive=True)
>>> E(s**t)
0.12*s**2/((1 - 0.7*s)*(1 - 0.6*s))

Problem Set 4, 18.445, MIT

Exercise 16

>>> S = {1, 2, 3} # given state space
>>> T = Matrix([[0.2, 0.4, 0.4],
... [0.1, 0.5, 0.4],
... [0.6, 0.3, 0.1]]) # given one-step transition matrix
>>> X = DiscreteMarkovChain('X', S, T)
>>> P(X[oo] = 1) # the chain will be in state 1 in the long run
11/39

Exercise 17

>>> S = {1, 2, 3, 4, 5} # given state space
>>> T = Matrix([[0, 1/3, 2/3, 0, 0],
... [0, 0, 0, 1/4, 3/4],
... [0, 0, 0, 1/2, 1/2],
... [1, 0, 0, 0, 0],
... [1, 0, 0, 0, 0]]) # given one-step transition matrix
>>> X = DiscreteMarkovChain('X', S, T)
>>> X.is_periodic
True
>>> X.is_reducible
False

Exercise 19

>>> S = {0, 1, 2, 3} # given state space
>>> T = Matrix([[0.1, 0.2, 0.3, 0.4],
... [0, 0.3, 0.3, 0.4],
... [0, 0, 0.6, 0.4],
... [1, 0, 0, 0]]) # given one-step transition matrix
>>> X = DiscreteMarkovChain('X', S, T)
>>> X.equilibrium_distribution
(20/63, 40/441, 15/49, 2/7)
>>> X.set_initial_state(2) # not a part of the problem but required since, every Markov chain is not required to have an initial state
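The equilibrium answers above can be verified with plain linear algebra in present-day SymPy: the stationary distribution pi solves pi*T = pi with entries summing to 1, i.e. pi spans the left null space of T - I. Checking Exercise 16:

```python
from sympy import Matrix, Rational, eye

# One-step transition matrix from Exercise 16.
T = Matrix([[Rational(1, 5), Rational(2, 5), Rational(2, 5)],
            [Rational(1, 10), Rational(1, 2), Rational(2, 5)],
            [Rational(3, 5), Rational(3, 10), Rational(1, 10)]])

# Left null vector of (T - I), normalised so the entries sum to 1.
pi = (T - eye(3)).T.nullspace()[0]
pi = pi / sum(pi)
print(pi.T)  # first entry is the long-run probability of state 1: 11/39
```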
@oscarbenjamin (Contributor) commented May 28, 2019

If the states are given as unordered sets how does the API know which one corresponds to which row/column of the transition matrix or probability vectors?

@czgdp1807 (Member Author) commented May 28, 2019

@oscarbenjamin I think the ambiguity can be removed by asking for an ordered set, because most of the problems I have come across give the state space as an ordered set.

@czgdp1807 (Member Author) commented May 28, 2019

In fact, Wolfram doesn't ask for a state space; it creates its own, so I think the restriction to ordered sets would not be wrong. Otherwise, we would have to do some extra computation, keeping a map of states in a dict, and I think that would not be very useful; it would just increase the overhead.

@Upabjojr (Contributor) commented May 28, 2019

currently probability doesn't support conditions to be tuples; however, this problem requires this to be done

& operator creates And objects. That's the way.
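A minimal sympy session illustrating the point (plain symbols stand in here for the indexed process variables):

```python
from sympy import Eq, And, symbols

x, y = symbols('x y')

# Chaining relationals with & produces a single And object, which can be
# passed to P() as one compound condition instead of a tuple.
cond = Eq(x, 0) & Eq(y, 1)
print(cond)
print(isinstance(cond, And))  # True
```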

@Upabjojr (Contributor) commented May 28, 2019

I don't like initial_state very much. What if a problem is stated in terms of a state that is not initial and you have to go backwards?

Can the initial state be defined in the conditions?

@czgdp1807 (Member Author) commented May 28, 2019

Can the initial state be defined in the conditions?

Something like, P(condition=X[4], given_condition=X[1])?

@Upabjojr (Contributor) commented May 28, 2019

Something like, P(condition=X[4], given_condition=X[1])?

Maybe something like

P(X[1], X[0] = SomeKindOfRandomVariable( ... ) )

This would also define the state space. I'm not sure about the transition matrix.

@czgdp1807 (Member Author) commented May 28, 2019

I don't think that a Markov chain ever attains a RandomVariable as a value. Moreover, the state space is the set of all the configurations a system can attain, so how can it be a random variable?
It would also be out of line with the implementations in other symbolic systems like Wolfram and would make things a bit difficult for the user. We should not make Markov chains too complicated. Well, that's just my point of view.

@czgdp1807 (Member Author) commented May 28, 2019

In addition, the state space should be defined by the user when creating the Markov chain object.

@Upabjojr (Contributor) commented May 28, 2019

I don't think that Markov chain ever attains a RandomVariable as value

A Markov chain is a chain of random variables.

Moreover, state space is the set of all the configurations a system can attain, so, how can it be random variable.

Every random variable has its own domain, which can be seen as a state space.

Moreover, it will be out of line with implementations in other symbolic systems like Wolfram and will make the things a bit difficult for the user.

The implementation by Wolfram is horrible.

We should not keep Markov chains too complicated. Well that's just my point of view.

Yes, let's avoid constructors with many parameters.

@Upabjojr (Contributor) commented May 28, 2019

In addition, state space should be defined by the user while creating the object for Markov chain.

Maybe. I would prefer to find a workaround, e.g. having SymPy deduce the state space.

@czgdp1807 (Member Author) commented May 30, 2019

I think the state space is more analogous to a domain than to a probability space.

@Upabjojr (Contributor) commented May 31, 2019

Let me suggest this structure:

  • StochasticProcess class, requiring only the variable name (e.g. X = StochasticProcess("X")), maybe with an optional state space parameter (otherwise assume all real numbers as the state space).
  • MarkovProcess (or MarkovChain) inheriting from StochasticProcess and adding the transition probabilities as a parameter (e.g. X = MarkovChain("X", transition_probs)).
  • MarkovProcessWithInitialState subclassing MarkovProcess and adding the definition of the initial state.

Classes with less information can be turned into Markov chains by specifying additional conditions to P and E.

For example:

X = MarkovChain("X", transition_probs)
P(X[T+1] == <something>, X[T] == RV)

No methods called .set_<something>, please.
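A bare-bones sketch of that hierarchy (class and parameter names follow the comment above, using the MarkovChain variant; the S.Reals default and attribute layout are illustrative assumptions, not the final design):

```python
from sympy import S

class StochasticProcess:
    """A symbolic process identified only by name, with an optional state space."""
    def __init__(self, name, state_space=S.Reals):
        self.name = name
        self.state_space = state_space

class MarkovChain(StochasticProcess):
    """Adds the one-step transition probabilities."""
    def __init__(self, name, transition_probs, state_space=S.Reals):
        super().__init__(name, state_space)
        self.transition_probs = transition_probs

class MarkovChainWithInitialState(MarkovChain):
    """Additionally fixes the distribution of the initial state."""
    def __init__(self, name, transition_probs, initial_state, state_space=S.Reals):
        super().__init__(name, transition_probs, state_space)
        self.initial_state = initial_state

X = MarkovChain("X", [[0.5, 0.5], [0.5, 0.5]])
print(X.name, X.state_space)
```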

@czgdp1807 (Member Author) commented May 31, 2019

@Upabjojr I support your suggestion. I will include it as a checkpoint. Do we need to discuss anything else about the API? I will wait 24 hours for comments; if none are made, I will raise an issue to discuss the class structure, assuming the above API is final.

@Upabjojr (Contributor) commented May 31, 2019

This is the external API. What about probability spaces associated with stochastic processes? How would you implement them?

@czgdp1807 (Member Author) commented May 31, 2019

A stochastic process is defined as a collection of random variables defined on a common probability space

As the Wikipedia definition of a stochastic process says, it is clear that we don't need a new implementation of probability spaces. The constructor can pass an existing probability space as the probability space of the stochastic process. We can also determine the probability space from the parameters at query time in P, E, etc.
Please give your suggestions.

@oscarbenjamin (Contributor) commented May 31, 2019

Is the StochasticProcess class intended to be used for continuous time/state stochastic processes as well? Otherwise maybe it should have a more specific name like DiscreteStochasticProcess.

@jksuom (Member) commented Jun 1, 2019

The sample space of a stochastic process is huge. Even in case of a Markov chain it is the space of all mappings from the discrete parameter space T (usually a range of integers) to the state space. A stochastic process can be conceptually defined by giving a probability measure on the sample space (a sigma-algebra of measurable subsets plus a positive measure of total measure 1).

The sample space is too big for implementation in a computer. The practical solution is the use of finite dimensional "slices" of the space, that is, a joint distribution of (X[t1], ..., X[tk]) for each finite series t1 < ... < tk of parameters. These are the (multidimensional) marginal distributions of the probability distribution on the big sample space. They satisfy a set of consistency conditions which mean that the joint distribution of a number of variables will determine that of any subset of those variables. For practical purposes, the most important result is that these joint distributions suffice to define the stochastic process (both discrete and continuous). That follows from the Kolmogorov extension theorem. Therefore, it is not necessary to consider infinite dimensional sample spaces explicitly. The finite dimensional product spaces currently defined for joint distributions will suffice.

@czgdp1807 (Member Author) commented Jun 1, 2019

The finite dimensional product spaces currently defined for joint distributions will suffice.

Agreed.

@czgdp1807 (Member Author) commented Jun 2, 2019

Probability Spaces encode processes that equal different values
probabilistically. These underly Random Symbols which occur in SymPy
expressions and contain the mechanics to evaluate statistical statements.

As far as I can tell from the above snippet of the PSpace docstring in sympy.stats.rv, probability spaces define the mechanics for evaluating statistical queries like P(X > n). I studied the Kolmogorov extension theorem and found that it is used to prove the existence of stochastic processes such as Markov chains and the Wiener process.
Coming to the probability space of a stochastic process, we need to define it so that it can handle queries like those in this comment. As @jksuom said, the sample space, or in the terminology of the sympy.stats code base the domain, of a stochastic process is too huge, due to the set I × S, where I is the index set and S is the common sample space of the random variables of the process. So defining a probability measure on such a huge set is nearly impossible on a computer. The other way is to evaluate the queries as they come. For example, in Markov chains the queries are usually of the type P(Eq(X[p], m), Eq(X[r], n)), which can be evaluated using the given transition probabilities. This can be achieved by overriding the probability method of PSpace. For a generalised stochastic process, it should use the currently defined joint distributions and return the results.

I suggest the following for the probability space of stochastic process,

  1. There can be a class StochasticPSpace inheriting from ProductPSpace.
  2. It should override methods like domain, probability, compute_density, compute_expectation, etc.
  3. For clarity, domain should return a ProductDomain representing I × S. compute_density should return the joint density of a finite collection of random variables, using the currently defined joint distributions. Similarly, compute_expectation can return the result for queries like E(X[t]).

Since I will be handling Markov chains, I have designed the API and the algorithms for handling queries related to them. See https://github.com/sympy/sympy/wiki/GSoC-2019-Statistics-Plan for more details of the work division.

Please correct me if I am wrong anywhere. ping @Upabjojr

@oscarbenjamin (Contributor) commented Jun 2, 2019

The sample space is too big for implementation in a computer

I'm not sure I understand what you mean by this. There are straightforward continuous-time stochastic processes for which reasonable calculations are possible. The obvious simple examples are Wiener processes and Poisson processes, with continuous time/continuous state and continuous time/discrete state respectively. The sample spaces for these are large, but the increments are straightforward, following either the normal or the Poisson distribution.

Maybe I misunderstand, but it seems that you are suggesting that it isn't possible for SymPy to do reasonable calculations with such processes without reinterpreting them as discrete processes. The reason I brought this up is that I would expect these to have a different API/implementation, much the same way that the stats module is already divided between continuous and discrete random variables.

Properly supporting (continuous) Wiener processes in particular is important for potentially being able to work with Ito/Stratonovich SDEs in the future. I realise this is all out of scope for this project, but while we're discussing API...

@czgdp1807 (Member Author) commented Jun 3, 2019

I went through the exact definitions of the Poisson process and the Wiener process, and I observed that there isn't a need for a DiscreteStochasticProcess or a ContinuousStochasticProcess: the queries are of the type P(N(t) - N(s) = k) (an example from the Poisson process) or P(W(t) + W(s) > a) (an example from the Wiener process), which can be solved by the current implementations in sympy.stats, so a separate framework for continuous and discrete stochastic processes isn't needed. For example, to solve P(S(t) + S(s) > a), where S is a stochastic process, we can form the random variable X = S(t) + S(s) and then use P(X > a). It doesn't matter whether X is discrete or continuous, as P(X > a) already has a mechanism for both continuous and discrete random variables.
That's just my point of view. If something is wrong anywhere, please let me know.
In short, we can have only a common framework, StochasticProcess.

@oscarbenjamin (Contributor) commented Jun 3, 2019

Your examples already show a potential (reasonable) API difference which is that you're using __call__ for continuous time processes and __getitem__ for discrete time processes.

@czgdp1807 (Member Author) commented Jun 3, 2019

Can you please elaborate on why a continuous time process requires __call__ and a discrete time process requires __getitem__?

@czgdp1807 (Member Author) commented Jun 3, 2019

I tried the following, and I can see that __getitem__ can be used for both discrete time and continuous time processes. Similarly, __call__ works for both as well.

Code

class Example:

    def __call__(self, t):
        return 1

    def __getitem__(self, t):
        return 2

E = Example()
print(E(1), E(1.5))
print(E[1], E[1.5])

Output

1 1
2 2

Please tell me if I am missing something.

@oscarbenjamin (Contributor) commented Jun 3, 2019

can be used

You can use all sorts of things and make them work. The question is what API we want. Presumably you already found, when looking into continuous processes, that the standard notation there is W(t) or N(t), so the process is expressed as a function of a real variable. On the other hand, for discrete processes the standard notation is integer subscripts, which in Python would look like X[n]. As a user, those are the notations I would expect.

@czgdp1807 (Member Author) commented Jun 3, 2019

Oh, I see; I got your point. Thanks for the explanation. Now I am also in favour of using different classes for discrete and continuous stochastic processes, at least to keep the APIs for the two types decoupled, though the logic for computation may remain the same. What do you say to something like:
StochasticProcess inherited by DiscreteStochasticProcess and ContinuousStochasticProcess, with the mechanics of computation differing wherever necessary.
Waiting for your views.

@jksuom (Member) commented Jun 3, 2019

I'm not sure I understand what you mean by this.

I mean that I don't know how to implement elements of infinite dimensional function spaces. Only a finite number of values can be specified at any one moment (tuple, dict). But that is enough. In most cases at most two values are needed, for example, for increments:

The sample spaces for these are large, but the increments are straightforward,

I would consider the following API for a general stochastic process: a method that would return, for each finite series of parameters t1 < ... < tk, a joint distribution of k variables x1, ..., xk. (A subclass can transform them to random variables X[n] or W(t) if desired.) Alternatively, one could also start with another method that would return a distribution for xk given x1, ..., xk-1. The joint distribution could then be constructed recursively for (x1), (x1, x2), (x1, x2, x3), ... by multiplying such conditional distributions. The implementation would be simple in many common cases (Markov chains, Wiener process), as the conditional distributions would depend only on xk-1.
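For a Markov chain the recursive construction described above collapses to a product of one-step factors; a small sketch of the idea (joint_pmf is illustrative, not existing API):

```python
from sympy import Matrix, Rational

def joint_pmf(p0, T, states):
    """P(X[0] = states[0], ..., X[k] = states[k]): the conditional
    distribution of each step depends only on the previous state, so the
    joint pmf factors as p0[x0] * T[x0, x1] * ... * T[x_{k-1}, x_k]."""
    prob = p0[states[0]]
    for prev, curr in zip(states, states[1:]):
        prob *= T[prev, curr]
    return prob

T = Matrix([[Rational(9, 10), Rational(1, 10)],
            [Rational(1, 2), Rational(1, 2)]])
p0 = [1, 0]  # start in state 0 with probability 1

print(joint_pmf(p0, T, [0, 0, 1]))  # 9/100
```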

@czgdp1807 (Member Author) commented Jun 3, 2019

I would consider the following API for a general stochastic process: a method that would return, for each finite series of parameters t1 < ... < tk, a joint distribution of k variables x1, ..., xk. (A subclass can transform them to random variables X[n] or W(t) if desired.)

Yes, there will be a function, joint_distribution, which will do exactly that. In fact, this is one of the pre-planned features from my side to be added to stochastic processes.
Thanks for the clarification though.
Please let me know of any more additions.

@oscarbenjamin (Contributor) commented Jun 3, 2019

I would consider the following api for a general stochastic process

Okay, I see what you mean. A general API can provide probabilistic queries based on a finite number of times for both discrete and continuous processes.

That makes sense and is a good starting point, but there are still relevant computable queries that can't necessarily be expressed that way, such as: what is the probability that W(t) is less than 2 for all t in [0, 1]? Maybe something separate would be needed for hitting times.

@czgdp1807 (Member Author) commented Jun 3, 2019

Maybe something separate would be needed for hitting times.

Yes, for hitting times, we can either use pre-computed results via methods like _hitting_time, the way we use _cdf; however, that would be quite ad hoc. We could also develop a new algorithm. The middle path is to use the pre-computed results until we are able to develop an algorithm capable of computing the required results. Though that's a different discussion: currently I believe at least the basic structure should be implemented, so that basic probability queries are answered, and then we can move on to developing the logic for hitting times.

@oscarbenjamin (Contributor) commented Jun 3, 2019

Hitting time could be a random variable like:

W = WienerProcess('W')
T = HittingTime('T', W, 2)
print(P(T > 1))

As for the general stochastic-process API based on joint distributions: do you mean that it returns a JointRV, a JointDistribution, a pdf, or what? Are joint distributions implemented for discrete RVs?

@czgdp1807 (Member Author) commented Jun 3, 2019

Hitting time could be a random variable like:

Nice suggestion, it can be done, I will make a separate PR for this, first let's complete the basic structure for basic queries.

Do you mean that it returns a JointRV or JointDistribution, or a pdf or what?

I would prefer JointDistribution.

Are joint distributions implemented for discrete RVs?

Yes, as far as I know.

@oscarbenjamin (Contributor) commented Jun 3, 2019

Are joint distributions implemented for discrete RVs?

Yes, as far as I know.

Can you demonstrate how to create the object that would be returned for the joint distribution in the proposed API? Suppose I want to calculate P(Eq(X[4], 1) & Eq(X[3], 0))

@Upabjojr (Contributor) commented Jun 3, 2019

Regarding the pspace of a stochastic process, we just need a proxy object to mimic the behaviour of slices on the underlying space. Obviously we are not going to implement an infinite dimensional space.

Regarding the API details for how methods are computed, @czgdp1807 you have much more freedom. What we need to fix is the API that end-users will be using, because SymPy is trying to keep the API stable (no sudden changes between versions).

Internal private methods can be changed freely even after releasing a new SymPy version.

@czgdp1807 (Member Author) commented Jun 4, 2019

@Upabjojr
Is this proposal for the probability space of a stochastic process good enough for implementation? Please provide your suggestions.

@czgdp1807 (Member Author) commented Jun 4, 2019

And are there any other additions or changes you want in the external API? Please share your views on that too.

@czgdp1807 (Member Author) commented Jun 4, 2019

Can you demonstrate how to create the object that would be returned for the joint distribution in the proposed API? Suppose I want to calculate P(Eq(X[4], 1) & Eq(X[3], 0))

If X is a Markov chain, then joint distributions aren't needed to answer the above query; the initial probabilities and transition probabilities are sufficient. However, for any other stochastic process, there is JointDistributionHandmade for custom joint distributions, where the user can pass a PMF/PDF to create the required distribution.
So if you know the PDF/PMF of X, you can do something like JointDistributionHandmade(pdf) and the required object will be made.
More concretely, you can get a rough idea from:

def joint_distribution(self, n):
    syms = [Indexed(self.name, k) for k in range(n)]
    pdf = Lambda(syms, Mul(*[self.pspace.distribution.pdf(sym) for sym in syms]))
    return JointDistributionHandmade(pdf)

The above is just to give an idea; the concrete implementation will include more customisation.

@oscarbenjamin (Contributor) commented Jun 4, 2019

The JointDistributionHandmade class has is_Continuous=True in its definition. Can it be used for discrete RVs?

@czgdp1807 (Member Author) commented Jun 4, 2019

Yes, I have made some prototypes and things work out well. In my experience, there are a few bugs in sympy.stats.joint_rv, which can be removed as we keep working with it. The main thing is that JointDistributionHandmade needs a pdf/pmf, which can be computed and passed in as required.
