Potential of daily wholesale data (2005-2014) for prediction vs. daily retail (2009-2013) #26
Comments
Do you think we should just pick the K best time-series and attempt to On 05/09/2014 01:07 AM, chingchia wrote:
|
@mstefanro, I would say that is the way to go, given the time constraints and the quality of the data. We can choose specific series and additionally try to feed in prices from neighbouring regions, social media indicators, and weather data with a researched offset. @ChingChia, are there more of these series in the wholesale data? Should we help check the data to filter out good series, or is the number very limited? Is the Delhi time series also potato? Considering frequency of consumption, potato, onion and apple are good choices; it would also be nice to find good series for rice, wheat and lentils. |
@Fabbrix Please check this table to get a sense of data availability in the wholesale dataset. Each cell shows the best valid-data rate for each (product, subproduct) in each region (note that 0.9 = 90%). An empty cell means that there is no series with more than 60% valid data. From the table, we have a few good individual series for rice and wheat (shown in the previous graphs), and some reasonable ones with about 70%-80% valid data. A table showing data availability for wholesale daily: The same table for retail daily: I will add some more graphs of the retail daily data later. |
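For anyone who wants to reproduce the availability table, here is a minimal sketch of how the best valid-data rate per (product, subproduct) and region could be computed. It assumes each series is a plain list with `None` for missing days; the record layout and the function names `valid_rate` and `best_rate_table` are hypothetical, not taken from the project code.

```python
from collections import defaultdict

def valid_rate(series):
    """Fraction of observations that are present (not None)."""
    if not series:
        return 0.0
    return sum(v is not None for v in series) / len(series)

def best_rate_table(records, threshold=0.6):
    """Build {(product, subproduct): {region: best valid-data rate}}.

    records: iterable of (region, product, subproduct, series) tuples
    (hypothetical layout). Series at or below the threshold are skipped,
    so a missing entry means "no series with more than 60% valid data".
    """
    table = defaultdict(dict)
    for region, product, subproduct, series in records:
        rate = valid_rate(series)
        if rate <= threshold:
            continue
        key = (product, subproduct)
        # Keep only the best rate seen for this region.
        if rate > table[key].get(region, 0.0):
            table[key][region] = rate
    return dict(table)
```

A missing key in the result corresponds to an empty cell in the table.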
@ChingChia
|
Wholesale daily regional plots: https://www.dropbox.com/s/25vd8fg5cznqqap/wholesale_daily_regional_plots_0.6.zip
|
Wholesale daily product plots
Rice, Wheat, Apple, Potato, Onion |
Usability review of selected wholesale series: |
The complete bundle of plots and tables for wholesale and retail, daily and weekly, including: Datasets:
Selected products = ['Rice','Wheat','Apple','Potato','Onion']
link: https://dl.dropboxusercontent.com/u/29566584/wholesale_retail_daily_weekly.zip |
Daily retail: The regional-best plot of onion seems to show a general country-wide pattern, while the standard deviation for rice and wheat stays more or less stable over the period as prices increase (we could try to match this to inflation). But maybe we're introducing a bias by selecting the regional best. Weekly: The price-per-product plots are very nice. For the weekly data they show that the onion price is very volatile but consistent across regions, while the prices for rice and wheat are less volatile but vary greatly across regions. Potato is also very volatile, with some differences between regions. Inflation also seems to manifest itself more in the price of rice and wheat than in the price of potato and onion. To empirically motivate the choice of granularity: a time series analysis of volatility at each granularity? For the network, I think it is not too important exactly which offset we use to feed in the weather data, because the reservoir has a memory property. |
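One possible way to compare volatility across granularities, as a rough sketch only: take the standard deviation of log returns on the daily series and on its weekly means. This assumes a gap-free series of positive prices (so interpolation would have to come first); the helper names `log_returns`, `weekly_means` and `volatility` are made up for illustration.

```python
import math
import statistics

def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def weekly_means(daily, days_per_week=7):
    """Average each complete block of days_per_week daily prices."""
    last_start = len(daily) - days_per_week + 1
    return [
        statistics.mean(daily[i:i + days_per_week])
        for i in range(0, last_start, days_per_week)
    ]

def volatility(prices):
    """Sample standard deviation of log returns (needs >= 3 prices)."""
    return statistics.stdev(log_returns(prices))

# Toy data: a price that flips between 10 and 12 every day for 4 weeks.
daily = [10.0 if i % 2 == 0 else 12.0 for i in range(28)]
daily_vol = volatility(daily)
weekly_vol = volatility(weekly_means(daily))
```

If most of the movement disappears at the weekly level, as in this toy series, the daily noise is probably not worth modelling; if it survives, weekly data may be enough.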
Looking at every series above a 60% valid rate in the 3 datasets, I realized that
2. In wholesale daily, merge series to construct an extended dataset with a more uniform profile over products and regions. Merge series with more than 60% valid data for the same product within each region by averaging, to get (Uttar Pradesh, Potato), (West Bengal, Rice), etc. Plots per region |
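A minimal sketch of the merge step described above, assuming all series are already aligned on the same dates and use `None` for missing days. The record layout and the name `merge_by_average` are hypothetical; days where every contributing series is missing stay `None`.

```python
from collections import defaultdict

def merge_by_average(records, threshold=0.6):
    """Merge series of the same product within each region by averaging.

    records: iterable of (region, product, series) tuples (hypothetical
    layout), with equal-length, date-aligned series. Only series with
    more than `threshold` valid data contribute to the merge.
    Returns {(region, product): merged_series}.
    """
    groups = defaultdict(list)
    for region, product, series in records:
        if not series:
            continue
        rate = sum(v is not None for v in series) / len(series)
        if rate > threshold:
            groups[(region, product)].append(series)

    merged = {}
    for key, group in groups.items():
        out = []
        for day_values in zip(*group):
            # Average only the values that are present on this day.
            present = [v for v in day_values if v is not None]
            out.append(sum(present) / len(present) if present else None)
        merged[key] = out
    return merged
```

Averaging only the present values means the merged series can have a higher valid rate than any single input, which is the point of the merge, but it also mixes subproducts with possibly different price levels.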
Before I dig into prediction, I'd like to share and discuss some thoughts.
We have wholesale daily (2005-2014) and retail daily (2009-2013) datasets.
1. Include a few very good wholesale daily series into prediction goals
The wholesale daily dataset is sparse, but we have some very good series with 80%-90% or more valid data over 10 years, which also appear very volatile and periodic. Although they are only a tiny portion of the whole picture, I suggest we still make good use of them to produce individual predictions.
Pre-interpolation graphs per region (zoom in or click it to see clearer graphs):
Uttar Pradesh
Apple and onion appear volatile and periodic, but we should discard the rice here, since its price is very stable.
West Bengal
Observe the periodic clustering of high volatility.
Gujarat
Super volatile potato.
NCT of Delhi
Wheat price
Some more to come tomorrow.