Identify classifications as coming from the app #11

Closed
willettk opened this issue Dec 5, 2014 · 20 comments

Comments

@willettk

willettk commented Dec 5, 2014

From a data analysis standpoint, we'd like to know which classifications users made through the app as opposed to the browser. Can you add a field to the final upload that indicates this? Something like:

{
interface: "android_app"
}

or equivalent would be fine.

@camallen suggested that it'd probably be done here:

private boolean doUploadSync(final String itemId, final String subjectId, final String authName, final String authApiKey) {

murraycu added a commit that referenced this issue Dec 5, 2014
…he app.

Add:
interface: murrayc.com-android-galaxyzoo
to the parameters in the content of the POST to make it easier for
the server to identify the app as the source of the classifications.
See #11
@murraycu
Owner

murraycu commented Dec 5, 2014

Gladly:
8b7ac85

I've used the existing User-Agent string instead of just "android-app" because it seems more specific in case there are ever more active apps. People might think they should reuse "android-app", but they wouldn't reuse someone else's domain.

So a classification now looks like this (in the POST's content):

interface:murrayc.com-android-galaxyzoo
classification[subject_ids][]:504e57f9c499611ea6019474
classification[annotations][0][sloan-0]:a-1
classification[annotations][1][sloan-1]:a-1
classification[annotations][2][sloan-2]:a-1
classification[annotations][3][sloan-3]:a-0
classification[annotations][4][sloan-9]:a-1
classification[annotations][5][sloan-10]:a-1
classification[annotations][6][sloan-4]:a-1
classification[annotations][7][sloan-5]:a-0
classification[annotations][8][sloan-6]:a-0
classification[annotations][8][sloan-6]:x-4
classification[annotations][9][sloan-11]:a-1

However, maybe you'd prefer it to be classification[interface] instead of interface?

For existing classifications, you can also use the User-Agent of the HTTP Post, if that gets through to your database:

public static final String USER_AGENT_MURRAYC = "murrayc.com-android-galaxyzoo";
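
For illustration, here is a minimal sketch of attaching that User-Agent to a POST with `HttpURLConnection`. The endpoint URL and the class and method names are assumptions made for this example, not the app's actual upload code, and nothing is sent over the network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

class UserAgentSketch {
    // Constant quoted in the thread.
    static final String USER_AGENT_MURRAYC = "murrayc.com-android-galaxyzoo";

    // Hypothetical endpoint, for illustration only; not the real upload URL.
    static final String EXAMPLE_URL = "http://example.invalid/classifications";

    /** Prepares (but does not send) a POST request carrying the app's User-Agent. */
    static HttpURLConnection preparePost(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("User-Agent", USER_AGENT_MURRAYC);
            return conn; // No network traffic happens until the caller connects or writes.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = preparePost(EXAMPLE_URL);
        System.out.println(conn.getRequestProperty("User-Agent"));
    }
}
```

Whether the header is of any use for analysis depends on the server recording it, which is the open question here.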

@brian-c

brian-c commented Dec 5, 2014

Galaxy Zoo is using an old branch of the main Zooniverse library, so it has its own Classification model which doesn't include the user agent.

Usually it's saved as an annotation (not a great place, but it's stuck for now) like this: https://github.com/zooniverse/Zooniverse/blob/master/src/models/classification.coffee#L91-L93

Which I think ends up looking like this:

. . .
classification[annotations][9][sloan-11]:a-1
classification[annotations][10][user_agent]:murrayc.com-android-galaxyzoo

Nothing outside the top-level classification key is stored.

@willettk
Author

willettk commented Dec 5, 2014

I agree that storing user agent annotations is annoying (especially for
RGZ; I hate them so much), but I suppose we'll deal with it as long as it's
stored somewhere.


@murraycu
Owner

murraycu commented Dec 5, 2014

So I should add this parameter instead of the "interface" thing?

classification[annotations][the-last-number][user_agent]:murrayc.com-android-galaxyzoo
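
As a rough sketch of what "the-last-number" means in practice, the parameters could be built like this. The class and method names are hypothetical, the output uses the `name:value` display form from this thread rather than real form encoding, and it assumes one annotation index per answer (the earlier example shows an index can repeat when a question has several answers).

```java
import java.util.ArrayList;
import java.util.List;

class ClassificationParamsSketch {
    static final String USER_AGENT_MURRAYC = "murrayc.com-android-galaxyzoo";

    /**
     * Builds the name:value lines of a classification POST body.
     * Each element of answers is a {questionId, answerId} pair, in answer order.
     */
    static List<String> buildParams(String subjectId, List<String[]> answers) {
        List<String> params = new ArrayList<>();
        params.add("classification[subject_ids][]:" + subjectId);

        int index = 0;
        for (String[] answer : answers) {
            params.add("classification[annotations][" + index + "]["
                    + answer[0] + "]:" + answer[1]);
            index++;
        }

        // The user agent goes in as one more annotation, at the next free index.
        params.add("classification[annotations][" + index + "][user_agent]:"
                + USER_AGENT_MURRAYC);
        return params;
    }

    public static void main(String[] args) {
        List<String[]> answers = new ArrayList<>();
        answers.add(new String[] {"sloan-0", "a-1"});
        answers.add(new String[] {"sloan-11", "a-1"});
        for (String line : buildParams("504e57f9c499611ea6019474", answers)) {
            System.out.println(line);
        }
    }
}
```

With two answers, the user agent lands at index 2, still inside the top-level `classification` key.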

@willettk
Author

willettk commented Dec 5, 2014

Think so. Will that work, @brian-c ?


@brian-c

brian-c commented Dec 5, 2014

That should be fine. @willettk are you working with the data directly from Mongo? I thought everybody got a nice CSV with labelled columns.

@willettk
Author

willettk commented Dec 5, 2014

For GZ, yes. I work with the Mongo data from Radio Galaxy Zoo, where user
agent data is much more of a pain.


@murraycu
Owner

murraycu commented Dec 6, 2014

OK:
d025448

Here are the content parameters from an example classification POST:

classification[subject_ids][]:504e4706c499611ea600d59d
classification[favorite][]:true
classification[annotations][0][sloan-0]:a-1
classification[annotations][1][sloan-1]:a-1
classification[annotations][2][sloan-2]:a-0
classification[annotations][3][sloan-3]:a-0
classification[annotations][4][sloan-9]:a-1
classification[annotations][5][sloan-10]:a-5
classification[annotations][6][sloan-4]:a-1
classification[annotations][7][sloan-5]:a-1
classification[annotations][8][sloan-11]:a-1
classification[annotations][9][interface]:murrayc.com-android-galaxyzoo

I've uploaded a couple of classifications already so you can check it on the server.

@brian-c

brian-c commented Dec 6, 2014

Sorry, that should be "user_agent", not "interface".

@murraycu
Owner

murraycu commented Dec 6, 2014

Thanks for checking. Done: 7b3ee22

And I've uploaded some more classifications to test that.

@murraycu
Owner

Could you please confirm that this is working for you?

@willettk
Author

There are several classifications that now have the annotation marked as coming from Android. However, there are only 13 in the entire sample - I would have expected much more than that if you've deployed this to the full audience.

Also: the new classification document has additional data normally associated with the subject (example below). I don't know if it's a problem, but it does bulk up the data products somewhat unnecessarily - I thought normally that one would get that data by linking to the subject_id. Is there a particular reason that it's been added, @brian-c or @murraycu?

> db.galaxy_zoo_classifications.findOne({'annotations.interface':{$exists:true}})
{
    "_id" : ObjectId("5482b4c227b56239a200000c"),
    "annotations" : [
        {
            "sloan-0" : "a-1"
        },
        {
            "sloan-1" : "a-0"
        },
        {
            "sloan-8" : "a-1"
        },
        {
            "sloan-5" : "a-1"
        },
        {
            "sloan-11" : "a-1"
        },
        {
            "interface" : "murrayc.com-android-galaxyzoo"
        }
    ],
    "created_at" : ISODate("2014-12-06T07:48:18Z"),
    "favorite" : [
        "true"
    ],
    "project_id" : ObjectId("502a90cd516bcb060c000001"),
    "subject_ids" : [
        ObjectId("504e5d62c499611ea601b902")
    ],
    "subjects" : [
        {
            "id" : ObjectId("504e5d62c499611ea601b902"),
            "zooniverse_id" : "AGZ0002f42",
            "location" : {
                "standard" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/standard/1237663229070082210.jpg",
                "thumbnail" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/thumbnail/1237663229070082210.jpg",
                "inverted" : "http://www.galaxyzoo.org.s3.amazonaws.com/subjects/inverted/1237663229070082210.jpg"
            },
            "coords" : [
                279.399026983604,
                78.0015454472869
            ],
            "metadata" : {
                "counters" : {
                    "feature" : 24,
                    "smooth" : 13,
                    "star" : 13
                }
            }
        }
    ],
    "tutorial" : false,
    "updated_at" : ISODate("2014-12-06T07:48:02.702Z"),
    "user" : {
        "classification" : "feature"
    },
    "user_ip" : "88.217.180.214",
    "workflow_id" : ObjectId("50251c3b516bcb6ecb000002")
}

@murraycu
Owner

> There are several classifications that now have the annotation marked as coming from Android.

Good, so it's basically working.

> However, there are only 13 in the entire sample

I bet most of them are me testing.

> I would have expected much more than that if you've deployed this to the full audience.

There are still not that many people using it, and I'd expect a lot of people to install it and forget about it after playing with it. Broadly, about 100 people have had the new version for about a week, during a time when there weren't many new installs. I'd be fascinated to know what the visitor retention numbers are for the website.

> Also: the new classification document has additional data normally associated with the subject (example below).

Is this specific to the classifications from the app? If so, then I think someone would have to look at what's happening on the server to trigger a change in behaviour.

@willettk
Author

My bad - virtually all classifications (except for the first couple) do have that subject data in the classification document, so that's nothing specific to the app.

@camallen

@willettk - re numbers of classifications, check for the user_agent key as per #11 (comment) and 7b3ee22
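
To make the difference concrete: the earlier query matched only the short-lived `interface` key, while later uploads use `user_agent`. A minimal in-memory sketch, assuming each classification is just a list of single-entry annotation maps as in the document dumped above (class and method names are hypothetical, and this is not a MongoDB client call):

```java
import java.util.List;
import java.util.Map;

class AnnotationKeySketch {
    /** True if any annotation in the classification carries the given key. */
    static boolean hasAnnotationKey(List<Map<String, String>> annotations, String key) {
        for (Map<String, String> annotation : annotations) {
            if (annotation.containsKey(key)) {
                return true;
            }
        }
        return false;
    }

    /** Counts classifications that carry at least one of the given annotation keys. */
    static long countWithAnyKey(List<List<Map<String, String>>> classifications,
                                String... keys) {
        long count = 0;
        for (List<Map<String, String>> annotations : classifications) {
            for (String key : keys) {
                if (hasAnnotationKey(annotations, key)) {
                    count++;
                    break; // Count each classification once, even if both keys match.
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<Map<String, String>>> sample = List.of(
                List.of(Map.of("sloan-0", "a-1"),
                        Map.of("interface", "murrayc.com-android-galaxyzoo")),
                List.of(Map.of("sloan-0", "a-1"),
                        Map.of("user_agent", "murrayc.com-android-galaxyzoo")),
                List.of(Map.of("sloan-0", "a-1")));
        System.out.println(countWithAnyKey(sample, "interface"));               // 1
        System.out.println(countWithAnyKey(sample, "user_agent", "interface")); // 2
    }
}
```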

@willettk
Author

Ah - thanks, @camallen! Looks like we have 5006 classifications so far from
the app; much more like what I'd thought.


@camallen

Good stuff! Out of interest, how many unique users have classified through the app?

@willettk
Author

46 so far. Embarrassing that I'm apparently not among them; guess I haven't logged back in since the latest update.

db.galaxy_zoo_classifications.distinct('user_name',{'annotations.user_agent':{$exists:true}})

@willettk
Author

About 27% (1354 classifications) on the app are from non-logged in users so far. 23 different countries, too (most are from Germany, UK, or the US).
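
For anyone checking these figures outside the Mongo shell, here is a small sketch of the two calculations. It assumes that a classification from a non-logged-in user simply has no `user_name` (modelled here as `null`); the class, method, and user names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AppUsageSketch {
    /** Distinct non-null user names, loosely analogous to distinct('user_name', ...). */
    static Set<String> distinctUsers(List<String> userNames) {
        Set<String> users = new HashSet<>();
        for (String name : userNames) {
            if (name != null) {
                users.add(name);
            }
        }
        return users;
    }

    /** Percentage of classifications with no user name attached. */
    static double anonymousPercent(List<String> userNames) {
        if (userNames.isEmpty()) {
            throw new IllegalArgumentException("no classifications");
        }
        long anonymous = 0;
        for (String name : userNames) {
            if (name == null) {
                anonymous++;
            }
        }
        return 100.0 * anonymous / userNames.size();
    }

    public static void main(String[] args) {
        // One entry per classification; null marks a non-logged-in user (assumption).
        List<String> names = Arrays.asList("alice", null, "bob", "alice");
        System.out.println(distinctUsers(names).size()); // 2
        System.out.println(anonymousPercent(names));     // 25.0

        // The thread's own figures: 1354 of 5006 classifications, roughly 27%.
        System.out.println(100.0 * 1354 / 5006);
    }
}
```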

@murraycu
Owner

> Looks like we have 5006 classifications

It's good to know people are using it. Thanks.

So, I think this is done. Closing.
