Nielsen Scanner #8

wbinzhe · 2021-07-11T20:52:33Z

Movement datasets

Why are there zero-unit observations?
What is the proper way to aggregate? Concern: products are measured in different units (counts, pounds).
Price change = f(retailer strategy, supplier/brand strategy). In response to climate risk, retailers can either reset prices or change suppliers.

shoonlee · 2021-08-04T14:56:53Z

I've talked to my Nielsen friends, and the SUL server seems to be good enough to try the Nielsen data cleaning. Please start data cleaning right away so that we can show something to Siqi next Thursday (Aug 12). I think we can start with the last 2-3 years (which are closer to the SafeGraph data), show it to Siqi, and expand the analysis to earlier years later.
Aggregation is a good question. People construct price indexes using Nielsen data. Beraja et al. (2019) is widely cited for index construction. A forthcoming paper by Leung (at ReStat) has replication code for price index construction following Beraja et al. (and also for overall data cleaning) if you need guidance.
On the price index construction, we could start with the overall price index (putting every repeated product into the index basket) and later build one for each category (e.g., medicine, food, general merchandise, etc. in a CVS) within a given store. For the Aug 12 meeting, the overall price index is enough.
The third point is also a good one. Consult other papers and see if they mention it (and if so, how they handle it).

shoonlee · 2021-08-05T15:52:26Z

I think it might be helpful for you to create a few slides and talk through them in our Aug 12 meeting. I want you to cover (at least) the following:

How to aggregate price at store level (namely, how to construct price indexes)
- To make it concrete, a toy example would be very helpful here
Some initial results (in a similar specification as before - regressing prices and revenues on temperature) with a sample of data
- Depending on the processing time, you could use a sample of categories for the last few years
Overall plan (including timeline) with the Nielsen data cleaning and analysis

wbinzhe · 2021-08-05T18:33:59Z

@shoonlee Sounds good, thanks Seunghoon. I'll draft these slides for our meeting next Monday, and by then I should have a better sense of what I can show to Siqi on Thursday.

shoonlee · 2021-08-05T19:05:30Z

Great. I created a folder "Binzhe" inside the "rmd -> slides" folder. Please make the slides in rmd so that we can easily add them to the main deck later.

wbinzhe · 2021-08-11T17:29:45Z

@shoonlee Hi Seunghoon, I added an illustration of the price index construction in slides #50-58 in G-slides. It currently covers only 450 stores (~1% random sample). I will keep the program running until this evening to get more stores into the sample and then merge the price index with the temperature data.

shoonlee · 2021-08-11T18:20:42Z

Hi Binzhe, thanks for putting this together. I think it would help to add a more concrete example. Pick a product group (e.g., yogurt or dairy products, depending on the actual level) and clearly show how the construction works. One thing that confused me was q_{i,y-1} (average quantity sold in each quarter in the previous year). Do you take the average over the entire year or by each quarter? In other words, is q_{i,y-1} different for each quarter, or is it the same for all quarters within a year? Showing a toy example would clarify these kinds of questions.

shoonlee · 2021-08-11T18:25:20Z

If you're unclear about what I mean by a toy example, see this video. https://youtu.be/-IvsuVtzGko

wbinzhe · 2021-08-11T18:46:42Z

@shoonlee Sure, Seunghoon. All the numbers in the slides are actually real observations from one specific store, so I will present the calculations directly there.

wbinzhe · 2021-08-11T18:53:49Z

@shoonlee q_{i,y-1} in equation 1 is the average over the previous year (i.e., it is the same for all quarters in the same year). Both Leung (2020) and Beraja et al. (2019) use this weight without variation across quarters.
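
To make the weighting concrete, here is a minimal toy sketch of a chained, Laspeyres-type quarterly index in which each product's weight is its average quarterly quantity from the previous year, held fixed across quarters. The exact form of equation 1 in the slides is not reproduced in this thread, so the formula, the product names, and all numbers below are illustrative assumptions, not the actual implementation.

```python
# Toy data: two products in one store. All numbers are made up for illustration.
# prices[product][(year, quarter)] = average unit price in that quarter
prices = {
    "yogurt_A": {(2018, 4): 1.00, (2019, 1): 1.10, (2019, 2): 1.21},
    "yogurt_B": {(2018, 4): 2.00, (2019, 1): 2.00, (2019, 2): 2.20},
}

# q_{i,y-1}: average quarterly quantity sold in the previous year (2018).
# One number per product, reused for every quarter of 2019.
q_prev_year = {"yogurt_A": 30.0, "yogurt_B": 10.0}


def quarterly_ratio(prices, weights, t_prev, t_curr):
    """Laspeyres-type price ratio between two quarters with fixed weights."""
    # Only products priced in both quarters enter the basket.
    common = [i for i in prices if t_prev in prices[i] and t_curr in prices[i]]
    num = sum(prices[i][t_curr] * weights[i] for i in common)
    den = sum(prices[i][t_prev] * weights[i] for i in common)
    return num / den


def chained_index(prices, weights, periods):
    """Chain quarterly ratios into an index normalized to 1 in the first period."""
    index = {periods[0]: 1.0}
    for t_prev, t_curr in zip(periods, periods[1:]):
        index[t_curr] = index[t_prev] * quarterly_ratio(prices, weights, t_prev, t_curr)
    return index


periods = [(2018, 4), (2019, 1), (2019, 2)]
index = chained_index(prices, q_prev_year, periods)
for period in periods:
    print(period, round(index[period], 4))
```

In this toy case the same weights apply to every 2019 quarter, so only prices move the index within the year. In a multi-year setting the weights would presumably switch to the new previous-year averages at each year boundary.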

shoonlee · 2021-08-25T02:40:09Z

@wbinzhe

Can you look into the following two things? These are what we've already discussed in the meeting with Siqi but please let me know if further clarification is needed. Please give me a brief update on Friday.

Update the revenue plot and fix potential errors as necessary (the one in #77-79)
- Create a similar plot using quantity sold as the outcome variable for the selected three groups
Update the price index figure at the store level (the one in #67)

shoonlee · 2021-08-28T01:35:42Z

@wbinzhe

Following up on our conversation today, can you try making graphs of the attrition rate as described below? By attrition rate, I mean the percentage of goods that are not in the base basket (e.g., if only 80% of the goods in the year t+1 basket overlap with the base basket goods, the attrition rate is 20%).

It will be a nice summary of the data as well as a useful sanity check of what we're doing. I think we can create these before running the time-consuming part of the code #4, #5.

For code 4 (fixed basket), can you create a plot of the attrition rate over time for each product group? You can pick 5 product groups (choose 5 including the three you've used before) for this exercise. Suppose we start from 2006 (or the earliest year you have already cleaned). As we add more years, the attrition rate should be weakly increasing over time.
For code 5 (chain basket), create a plot of the year-to-year attrition rate for each of the five product groups (calculate the attrition rate between 2006-2007, 2007-2008, etc.). If there are any outliers, either within a product group or across product groups, investigate them. I think the rate should be roughly constant over the years for each product group, although there might be substantial level differences across product groups.

Let me know if any clarification is needed. Thanks!!
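
A minimal sketch of the two attrition calculations described above, using made-up product IDs. It assumes the attrition rate is measured relative to the size of the earlier (base) basket, and that the fixed basket keeps only goods observed in every year so far. Both are assumptions about the intended definition, not confirmed details of code 4 or code 5.

```python
# Toy baskets: set of product IDs observed in each year (made-up data).
baskets = {
    2006: {"a", "b", "c", "d", "e"},
    2007: {"a", "b", "c", "d", "f"},
    2008: {"a", "b", "c", "f", "g"},
}


def fixed_basket_attrition(baskets):
    """Attrition relative to the first-year basket.

    A good survives to year t only if it appears in every year up to t,
    so the rate can never decrease as years are added.
    """
    years = sorted(baskets)
    base = baskets[years[0]]
    survivors = set(base)
    rates = {}
    for year in years[1:]:
        survivors &= baskets[year]
        rates[year] = 1 - len(survivors) / len(base)
    return rates


def chain_basket_attrition(baskets):
    """Year-to-year attrition: share of year-t goods missing in year t+1."""
    years = sorted(baskets)
    rates = {}
    for prev, curr in zip(years, years[1:]):
        overlap = baskets[prev] & baskets[curr]
        rates[(prev, curr)] = 1 - len(overlap) / len(baskets[prev])
    return rates


fixed_rates = fixed_basket_attrition(baskets)
chain_rates = chain_basket_attrition(baskets)
print(fixed_rates)
print(chain_rates)
```

In practice the same functions would be applied separately to each product group, with one line per group in each plot.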

wbinzhe · 2021-08-28T03:53:23Z

@shoonlee Sure, I will also do it this Saturday!

wbinzhe · 2021-08-29T12:04:50Z

@shoonlee I fixed the problem in sales (and also price): in a parallelized step, the elements of the input lists were not in the same default order, so data for store i in 2018/2019 were being combined with data for store j in 2016/2017. The group-level plots look good now. I will continue working on the remaining tasks today and let you know when they are done.
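
For reference, one way to guard against this kind of mismatch is to key every parallel result by (store, year) and combine by key rather than by list position. The sketch below is a hypothetical Python illustration, not the project's actual code: `clean_store_year` is a made-up stand-in for the real cleaning step, and the sales values are fake.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def clean_store_year(store, year):
    """Hypothetical stand-in for the real per-store, per-year cleaning step."""
    return {"store": store, "year": year, "sales": 100 * store + (year - 2000)}


def process_all(tasks):
    """Run the tasks in parallel and index each result by its (store, year) key."""
    results = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(clean_store_year, s, y): (s, y) for s, y in tasks}
        # Completion order is arbitrary, so never rely on position.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


def combine_years(results, store, years):
    """Collect one store's rows across years by key, then sanity-check them."""
    rows = [results[(store, y)] for y in years]
    assert all(r["store"] == store for r in rows), "store mismatch when combining"
    return rows


tasks = [(s, y) for s in (1, 2) for y in (2016, 2017, 2018, 2019)]
results = process_all(tasks)
store_1 = combine_years(results, 1, [2016, 2017, 2018, 2019])
print([(r["store"], r["year"]) for r in store_1])
```

The explicit store check inside `combine_years` is cheap and would fail loudly if two differently ordered lists were ever zipped together again.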

wbinzhe changed the title ~~Nielson Scanner~~ Nielsen Scanner Aug 29, 2021
