Bangladesh research - track data usage #27

Closed
cecilepompei opened this issue Apr 24, 2015 · 15 comments

Comments

@cecilepompei

Driver: @APLSmith
Team @benrito @Laurareynal

We need to find an app that tracks data usage, or a way to remotely access data usage records, for the research in Bangladesh.

Timeline : next week :)

@APLSmith

This seems to be the only app with .csv export.
https://play.google.com/store/apps/details?id=net.rgruet.android.g3watchdogpro&hl=en

Costs $3.

Anyone find any alternatives?

@lauradereynal

What are we aiming to observe exactly? What would we access? For how long? Would it be anonymized?

@thisandagain and @secretrobotron, can you advise?

Laura de Reynal
Webmaker - Field Research
@lau_nk https://twitter.com/lau_nk
Skype: laura-dereynal

@APLSmith

What are we aiming to observe exactly? Data usage patterns and apps used.
What would we access? Nothing. At the exit interview, DNet would need to get a record of the data used over the period.
Would it be anonymized? Yes. DNet wouldn't need to send us names.

@thisandagain

Thanks for clarifying @APLSmith .

Watchdog Pro is the best option that I know of based on your needs. Does our standard release cover a use case as potentially sensitive (PII + personal) as this? We need to make sure this works with our data retention policies and that we have an agreement / approach for anonymization in place with dnet.

cc @adamlofting @LauraReynal @davidascher

@davidascher
Contributor

There's no way we can move on this and have an answer by next week. I'm not quite sure I understand the motivations anyway. Please provide more justification: what this data would be useful for, for whom, etc.

@APLSmith

Thanks @thisandagain. This would obviously have to be incorporated into the release form.
@cecilepompei who is going to be responsible for that?

@davidascher the hypothesis here is that those in the treatment group will exhibit higher levels of data use and explore a wider range of services. This is primarily for the operators who very much think in those terms. Thoughts?

@cecilepompei
Author

@APLSmith - you are in charge, it is the research part.
I have added it in our project management tool, line 13
https://docs.google.com/a/mozillafoundation.org/spreadsheets/d/1gcjul0xAQQZoVXsX9fRUoKMwNPt9swYrQSMd1bVnxyE/edit#gid=1985808196

Cécile POMPEÏ
Mobile Service Lead, Webmaker.org
+44 75 98 490 115

@benrito

benrito commented Apr 28, 2015

With opt-in and user consent, we are looking to record:

  • total data consumption in a 4-week cycle
  • time series of how much data is consumed over time
  • per-app breakdown

And compare a control group with a test group that receives a brief digital skills training.

This is the same data that is provided by the Android system data management monitor, but we are not aware of any way to export that data (hence the search for a third party alternative).
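As a rough sketch of how the three measures above could be derived once some export exists, the snippet below aggregates one participant's usage rows. The column names (`date`, `app`, `bytes`) and the sample rows are hypothetical placeholders for illustration only; they are not the actual export format of any app mentioned in this thread.

```python
import csv
import io
from collections import defaultdict


def summarize_usage(csv_text):
    """Return (total_bytes, bytes_per_day, bytes_per_app) for one participant.

    Assumes a hypothetical CSV layout with 'date', 'app' and 'bytes' columns.
    """
    total = 0
    per_day = defaultdict(int)
    per_app = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        used = int(row["bytes"])
        total += used                   # total consumption over the cycle
        per_day[row["date"]] += used    # time series, one point per day
        per_app[row["app"]] += used     # per-app breakdown
    return total, dict(per_day), dict(per_app)


# Made-up sample rows, only to show the shape of the input.
sample = """date,app,bytes
2015-05-01,browser,1200
2015-05-01,chat,300
2015-05-02,browser,500
"""

total, per_day, per_app = summarize_usage(sample)
print(total, per_day, per_app)
```

Only byte counts per day and per app are read here, which matches the intent of not looking at message content.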

We are not interested in any of the content of the communications.

Although @APLSmith is helping, @thisandagain is flagging anonymization and the data retention protocol.

@adamlofting can you advise?

@adamlofting

I'm not sure who all the parties involved here are, and some of that will make a difference (mainly is Mozilla doing the research and collecting the data?).

These are the practices we need to comply with for collecting and storing data as MoFo: https://mana.mozilla.org/wiki/display/DATAPRACTICES/Foundation+Data+Practices (sorry behind LDAP)

If we were collecting just data consumption data (and presumably linking it with PII at least in the short term) that seems straightforward.

But logging app use and/or anything more granular, which can indirectly say a lot about a person, makes the data much more sensitive, especially if this is a target audience needing digital skills training. I don't know who dnet are, but if they are doing the research, we need to check with Legal that our agreements cover anonymization as @thisandagain mentioned. This process is started by filing a Data Compliance bug; info here: https://mana.mozilla.org/wiki/display/DATAPRACTICES/Frequently+Asked+Questions

I would also check in with Legal about what happens when we actively install 3rd party software on other people's devices, and who is responsible for what in that situation.


Separately, I'd love to see how we're planning the research.

@benrito

benrito commented Apr 28, 2015

Hi @adamlofting,

Could you help be our sherpa? I believe there is minimal exposure if we design the research carefully, and want to avoid opening a protracted / abstract discussion with legal (which we know can happen).

Dnet is directly performing the research on behalf of Mozilla. They are a contractor with an agreement on file.

The plan is to:

1. form a control & test group that is representative of the kind of people who will be transitioning to smartphones in the next few years

2. invite them to a study, and secure their informed consent to participate in the study

3. provide them with a smartphone (which will become a gift if they complete the program), and a SIM card with a 2GB data plan

4. All participants do entrance interviews. Test group undergoes skills training.

5. All participants return for exit interviews after the trial period. Data is collected from phones.

Let’s have a discussion on what level of anonymization is necessary. I do believe that explicit, informed consent of users as to what data we are collecting, and why, should mitigate some of the things we’re trying to guard against in the data practices.
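One possible starting point for that discussion, sketched under assumptions: the field partner assigns each participant a random ID, keeps the name-to-ID mapping locally, and shares only the ID alongside the usage figures. The function name, the `P-` prefix and the ID length below are hypothetical choices for illustration, not an agreed protocol.

```python
import secrets


def assign_participant_ids(names):
    """Map each participant name to a random, non-guessable ID.

    The returned mapping is meant to stay with whoever runs the interviews;
    only the IDs would accompany the collected data.
    Assumes the names in the list are distinct.
    """
    mapping = {}
    used = set()
    for name in names:
        pid = "P-" + secrets.token_hex(4)      # 8 hex characters
        while pid in used:                     # regenerate on the rare collision
            pid = "P-" + secrets.token_hex(4)
        used.add(pid)
        mapping[name] = pid
    return mapping


# Placeholder names, not real participants.
mapping = assign_participant_ids(["participant one", "participant two", "participant three"])
print(sorted(mapping.values()))
```

Random IDs are used instead of hashes of names because a hash of a short name can be reversed by guessing, while a random ID carries no information about the person.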

The app level feedback is meant to determine how / whether skills training will lead users to install and use more apps. Users in the developing world generally don’t install apps at all. We want to understand if some education during on-boarding would change that.

The brief is here and mostly up to date: https://docs.google.com/document/d/1_u8c0YplYl2dr6blWwGV5fDtIjhl8TFkEy-1vNoXfhY/edit#heading=h.9jfd597pkgh8

This study will form evidence for a final paper.

We are tracking the whole project here:
https://docs.google.com/a/mozillafoundation.org/spreadsheets/d/1gcjul0xAQQZoVXsX9fRUoKMwNPt9swYrQSMd1bVnxyE/edit#gid=1985808196

Cheers!
Ben

@APLSmith

In addition to what @benrito said above: one way we could collect this data is through a survey form like:
https://gsma.co1.qualtrics.com/jfe/form/SV_cINfZ78nVNHfDRb

DNet would enter the data directly into the above form (this is the GSMA's survey account - the survey itself can only be accessed by myself).

I actually met @adamlofting yesterday at Mozilla (nice to meet you Adam!). If you are willing to take this on, that would be great. What other information would you need?

@adamlofting

All the extra context helps greatly with understanding the project (especially the informed consent and data minimization piece).

As discussed with @cecilepompei and @APLSmith in the office, I'm on PTO next week, but if you can prepare a 1-page explanation of the project by COP today I'll get this bug filed in the right place before I'm off. You'll need to keep this moving in my absence though.

I suggest using Ben's explanation above, but for each point also explain which organization is doing the work. And for the points where data is captured, note where it is being stored (e.g. GSMA's survey tool), and lastly note what data we as Mozilla are storing (based on the above it looks like it will be a very minimal anonymized dataset at the end so this shouldn't be problematic).


Comments from me not related to the legal bug:

  • That form probably needs an extra field to identify who is logging the data, and maybe also a unique participant ID - otherwise duplicate submissions, human error and/or spam could make the whole dataset useless when it comes to analysis.
  • I'd also add a 'total number of apps used' field
  • When we talk about a research paper, are we talking about a peer-reviewed journal level of research paper? If our sample size for each group is ~25 people, we'd likely need to see an enormous uptick in data usage to get results with any statistical significance. It still makes sense to test initially with a group of this size, but the result might be 'recommend testing again with a bigger group to confirm results'.
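To put a rough number on the sample-size point, here is a minimal sketch using the standard normal approximation for a two-sided, two-group comparison of means. The 5% significance level and 80% power are conventional defaults assumed for illustration, not values taken from the study plan, and an exact t-test calculation would give a slightly larger figure.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized difference in means (in standard deviations)
    detectable with the given power, using the normal approximation for
    two equal-sized independent groups."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n_per_group)


d = min_detectable_effect(25)
print(round(d, 2))
```

Under these assumptions, with about 25 people per group the difference in average data usage would need to be roughly 0.8 standard deviations before it is reliably detected, which is consistent with the "enormous uptick" caveat above.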

@davidascher
Contributor

davidascher commented Apr 30, 2015 via email

@APLSmith

Thanks @adamlofting. I have sent you a 1 pager.

On your comments:

  1. Agree totally on the extra fields (what I put up was just a rough sketch).
  2. This won't be a peer reviewed piece. Completely agree on the point about statistical significance. This is meant to be a quick and dirty initial test - whatever the result, one of the recommendations will be

@davidascher I think we will just use the data monitor already installed on all Android devices. This should make the whole thing simpler.

@adamlofting

1 pager received, thanks. And the bug has been filed with legal.

I think it's in a good place, with a well-defined scope and answers for the key points in our data practices. But if they have any concerns, they should reach out to @cecilepompei or @benrito.
