assumed ordering is wrong #20

Closed
XilinJia opened this issue Mar 18, 2014 · 13 comments

Comments

@XilinJia
Contributor

The indexing order seems to assume that the oldest data is at index n and the newest at index 1. But that is not how typical data are arranged. Typically, the oldest data is at index 1 and the newest at index n.

In the typical ordering of the data, should the for loop be as follows?

for i in n:length(ta)
    vals[i] = mean(ta.values[i-(n-1):i])
end

@milktrader
Member

The current code is:

for i in 1:length(ta) - (n-1)
    vals[i] =  mean(ta.values[i:i+(n-1)])
end

So, using n=10, the first time it takes ta.values[1:10], the second calculation is ta.values[2:11], which is consistent with a moving window across which to take an average.

In your example, the first calculation would be for ta.values[-8:1], which will produce a BoundsError.

@XilinJia
Contributor Author

The current code calculates the average of future data if it is applied to a typical time series and the resulting series is synced with the original by index.

In my example, using n=10, the first time it takes ta.values[1:10], and the result is stored at vals[10]; the second calculation is ta.values[2:11], and the result is stored at vals[11]. This syncs with the original series.
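The difference between the two loops can be sketched on a plain vector. This is only an illustration under assumptions: `data` is a hypothetical series with the oldest observation first, the arrays are ordinary Julia vectors rather than TimeArray objects, and `mean` is loaded from the Statistics standard library as in Julia 1.x.

```julia
using Statistics  # provides `mean` in Julia 1.x

data = collect(1.0:12.0)   # hypothetical series, oldest observation first
n = 10

# Current loop: the mean of data[i:i+n-1] lands at index i, the window's first element.
forward = zeros(length(data) - (n - 1))
for i in 1:length(data) - (n - 1)
    forward[i] = mean(data[i:i+(n-1)])
end

# Proposed loop: the mean of data[i-(n-1):i] lands at index i, the window's last element.
trailing = fill(NaN, length(data))   # indices 1:n-1 have no full window
for i in n:length(data)
    trailing[i] = mean(data[i-(n-1):i])
end

# Both loops compute the same window means; they differ only by a shift of n-1 positions.
println(forward == trailing[n:end])
```

Read by array position, forward[1] sits next to data[1] even though it averages data[1:10]; trailing[10] holds the same number next to the last observation it used.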

@milktrader
Member

Okay, I see that you start your iteration at n and not 1, my mistake.

@milktrader
Member

The current code takes care of aligning the vals array with its correct corresponding timestamps by chopping off the first n-1 dates.

This is not an ideal solution, given that the whole NA vs NaN issue has basically been avoided by ignoring it.

If TimeSeries forces the values element to be a Float, then it will make sense to solve this by initializing an array of NaN, thereby conflating the definition of NaN with NA. In this case, your example of starting to replace an initialized array at the n index will be correct.
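A minimal sketch of the NaN-initialized approach, assuming Float64 values. The helper name `sma_padded` is hypothetical and not part of TimeSeries; it only shows how NaN would stand in for NA in the first n-1 slots.

```julia
using Statistics  # provides `mean` in Julia 1.x

# Hypothetical helper, not part of TimeSeries: a trailing moving average
# whose first n-1 slots are left as NaN.
function sma_padded(data::AbstractVector{<:Real}, n::Integer)
    vals = fill(NaN, length(data))
    for i in n:length(data)
        vals[i] = mean(data[i-(n-1):i])
    end
    return vals
end

data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
avg = sma_padded(data, 3)

# Comparisons against NaN are always false, so padded slots never count as "above".
above = data .> avg

# isnan marks the slots that stand in for missing values.
padded = isnan.(avg)
```

The result has the same length as the input, so data[j] and avg[j] always refer to the same observation. The cost is the one named above: a NaN produced by real arithmetic can no longer be told apart from a slot that simply has no value yet.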

@XilinJia
Contributor Author

I think it is very important to sync these series by index. This would help tremendously when, say, one wants to determine whether ta.values[j] is above the corresponding sma, in which case one would only need to compare ta.values[j] with sma.values[j].

@milktrader
Member

TimeSeries currently takes care of this by ensuring that timestamps align.
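The idea of aligning by timestamp rather than by array position can be sketched with plain vectors. This is not how TimeSeries implements it; `dates`, `data`, and the `Dict`-based lookup are assumptions made only for illustration.

```julia
using Dates, Statistics

dates = collect(Date(1980, 1, 1):Day(1):Date(1980, 1, 6))   # hypothetical timestamps
data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
n = 3

# Window means from the forward-indexed loop.
vals = [mean(data[i:i+(n-1)]) for i in 1:length(data)-(n-1)]

# Dropping the first n-1 dates pairs each mean with the last date in its window.
sma_dates = dates[n:end]

# Join the two series on timestamp instead of on array position.
lookup = Dict(zip(sma_dates, vals))
above = [d for (d, x) in zip(dates, data) if haskey(lookup, d) && x > lookup[d]]
```

Here vals[1] sits at array position 1 but carries the date 1980-01-03, so a comparison keyed on dates never looks at future data, even though the positions of the two arrays do not line up.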

I did find a display error, though, while using this example:

using MarketData

cl .> sma(cl, 10);

@milktrader
Member

While I work on displaying Bool values in TimeSeries, here is a way to view the result above:

julia> cl[findwhen(cl .> sma(cl,10))]
268x1 TimeArray{Float64,1} 1980-01-16 to 1981-12-10

             Close
1980-01-16 | 111.05
1980-01-17 | 110.7
1980-01-18 | 111.07
1980-01-21 | 112.1

1981-12-04 | 126.26
1981-12-07 | 125.19
1981-12-09 | 125.48
1981-12-10 | 125.71

@XilinJia
Contributor Author

How do we do element-wise operations then?

@milktrader
Member

Put a dot in front of the operator, e.g. .+

@XilinJia
Contributor Author

XilinJia commented Apr 3, 2014

To me that is an array-wise, per-element operation, which is the preferred way of doing things in R because otherwise it would be slow. Shouldn't Julia be more flexible and provide real element-wise operations?

@milktrader
Member

Vectorizing in R (i.e. passing loops to C) is necessary because there isn't any reasonably fast alternative. It turns out, though, that vectorizing isn't always efficient. This is a good question for the julia-users group.
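To make the distinction concrete, here is a small sketch of the same computation written both ways. The function names are hypothetical. Because Julia compiles explicit loops to native code, the loop form is a normal way to write a per-element operation and it avoids allocating a temporary array.

```julia
# Vectorized form: the dotted operator builds a whole Boolean array first.
function count_above_vectorized(x::Vector{Float64}, y::Vector{Float64})
    return sum(x .> y)
end

# Devectorized form: an explicit loop compares one pair at a time,
# with no temporary array.
function count_above_loop(x::Vector{Float64}, y::Vector{Float64})
    total = 0
    for i in eachindex(x, y)
        if x[i] > y[i]
            total += 1
        end
    end
    return total
end

x = [3.0, 1.0, 4.0, 1.0, 5.0]
y = [2.0, 2.0, 2.0, 2.0, 2.0]

println(count_above_vectorized(x, y) == count_above_loop(x, y))
```

Which form runs faster depends on the workload, so it is worth measuring rather than assuming.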

@milktrader
Member

Also, see the Devectorize.jl package by @lindahua for an explanation.

@milktrader
Member

stale issue.


2 participants