pushed_at or created_at? #27

@leebrian

I think we should populate the created date from created_at instead of pushed_at (both are attributes from the github3.py library).
https://github.com/LLNL/scraper/blob/c5a3373431db33ebc43a7b428c3649661ebeeec1/scraper/code_gov/models.py#L266

A developer is working on TFS help_wanted (#23), and while testing I noticed some odd behavior with the created dates for GitHub repos.

Looking at https://api.github.com/repos/dun/munge (I think this is a common test repo for you), I see the following attributes in the response:

    "created_at": "2015-05-15T22:16:19Z",
    "updated_at": "2019-01-31T01:40:47Z",
    "pushed_at": "2019-02-02T01:48:44Z",

which results in the following attributes in code.json:

    "date": {
        "created": "2019-02-02",
        "lastModified": "2019-01-31"
    }

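This was confusing to me because the created date is later than the lastModified date. The created value (2019-02-02) matches pushed_at, while lastModified (2019-01-31) matches updated_at.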

The library docs define pushed_at as "A parsed datetime object representing the date a push was last made to the repository."
and created_at as "A parsed datetime object representing the date the repository was created."

So I think created_at is the appropriate attribute. If you agree, I'll include this change in my next merge request, which I'll submit once we finish internal testing.
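To make the difference concrete, here is a minimal sketch that rebuilds the date block from the munge values quoted above. It does not use the scraper's actual code: `to_date_string` and `build_date_block` are hypothetical helper names, and taking lastModified from updated_at is an assumption inferred from the output shown above.

```python
from datetime import datetime

# Timestamp format used in the GitHub API response quoted above.
GITHUB_TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_date_string(timestamp):
    """Convert an API timestamp such as 2015-05-15T22:16:19Z to YYYY-MM-DD."""
    return datetime.strptime(timestamp, GITHUB_TIMESTAMP_FORMAT).date().isoformat()


def build_date_block(repo, created_field="created_at"):
    """Hypothetical helper: build the code.json date block from a repo dict.

    Assumption: lastModified is taken from updated_at, inferred from the
    code.json output in this issue rather than from the scraper source.
    """
    return {
        "created": to_date_string(repo[created_field]),
        "lastModified": to_date_string(repo["updated_at"]),
    }


# Values copied from the https://api.github.com/repos/dun/munge response above.
repo = {
    "created_at": "2015-05-15T22:16:19Z",
    "updated_at": "2019-01-31T01:40:47Z",
    "pushed_at": "2019-02-02T01:48:44Z",
}

current = build_date_block(repo, created_field="pushed_at")  # behavior reported here
proposed = build_date_block(repo)                            # behavior proposed here

print(current)
print(proposed)
```

With pushed_at, created lands after lastModified; with created_at, the two dates are in the expected order.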
