Thread for CUSP winter 18-19 check in meetings #9

Open
3 tasks
patwater opened this issue Nov 5, 2018 · 18 comments

Comments

@patwater
Contributor

patwater commented Nov 5, 2018

The scribe takes notes (feel free to rotate each meeting). Each set of meeting notes should include:

  • Who attended?
  • Discussion of progress since last meeting
  • Tasks to complete by the next meeting

In addition, as available throughout the week, please share data visualization results in the issue thread linked here

@patwater changed the title from "Check-in meeting 1" to "Thread for CUSP winter 18-19 check in meetings" on Dec 4, 2018
@patwater
Contributor Author

  • ALL please read the GitHub Documentation for Governments and API (if time)
  • Wenjie ( @wz1405 ) will take notes in future meetings
  • Weekly call at 10 AM PST / 1 PM EST on Fridays
  • PA to create a slack channel for the group

CC @williamburgson @jianweili0

Thanks much!

@wz1405
Collaborator

wz1405 commented Dec 14, 2018

My email is wz1405@nyu.edu at your convenience.

@jianweili0

My email is jl9200@nyu.edu, if you are sending the Slack invitation.

@patwater
Contributor Author

Great, I sent the Slack invite to the emails in the calendar invite. For future reference, you may not want to share your email on GitHub, as it's public facing. Probably fine, though it can be an invitation to get spam mail. Thanks much!

For future meetings, please include the following in each set of notes:

  • Who attended?
  • Discussion of progress since last meeting
  • Tasks to complete by the next meeting

Thanks much!

@wz1405
Collaborator

wz1405 commented Dec 21, 2018

Date: December 21st, 2018
Time: 13:30 - 14:00 EST
Attendance: Jianwei Li, Wenjie Zheng, Guanjia Wang, and Patrick Atwater

Progress: Overview
Several Pieces of Notes:
-> GitHub API requires Bash Commands
Q: Empty GitHub and empty files count towards data frame?
A: Yes
Q: Github file/project progress/code sharing
A: Through GitHub
-> branch -> push to the branch CUSP Winter 2018
-> be sure to check the latest version of the branch before starting to work
-> Inventory Country Names crawled.
-> A total of 1159 Agency / Org / Government
Q: Scrape all (record all)
A: (tag) for the specific type of i.e., agency/government included for the final data frame.
-> Number of contributors
-> make the initial data frame expansive (include as much information as we can?) and filter them later on.
-> Python library: PyGithub
-> Credential Issue!
-> Graham Henke for Reference
Plan:

  1. Communicate through Slack Channel.
  2. GitHub branch set up.
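The "expansive first, filter later" idea from the notes above could look roughly like the following. This is only a minimal sketch: the field names (`org`, `org_type`, `repo`, `contributors`) and the sample rows are hypothetical and are not taken from the project data. It keeps every crawled record, including empty ones, tags each with a type, and applies the filter as a separate last step.

```python
# Hypothetical records: field names and values are illustrative only.
RECORDS = [
    {"org": "example-agency", "org_type": "agency", "repo": "open-data", "contributors": 4},
    {"org": "example-city", "org_type": "government", "repo": "", "contributors": 0},
    {"org": "example-lab", "org_type": "org", "repo": "tools", "contributors": 2},
]


def filter_by_type(records, wanted_types):
    """Keep only the records whose org_type tag is in wanted_types."""
    wanted = set(wanted_types)
    return [record for record in records if record["org_type"] in wanted]


# The empty "example-city" entry stays in the initial data and survives the filter,
# because filtering is done on the tag, not on whether the repo is empty.
final = filter_by_type(RECORDS, {"agency", "government"})
print([record["org"] for record in final])
```

Keeping the tag as a plain column means the same initial dump can be re-filtered later without crawling again.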

@wz1405
Collaborator

wz1405 commented Dec 28, 2018

Date: December 28th, 2018
Time: 13:00 - 13:40 EST
Attendance: Jianwei Li, Wenjie Zheng, and Guanjia Wang

  1. Technical difficulties discussed
    • Regarding Push Access issue on PyGithub
  2. Missing 'number of contributors'
  3. Visualization questions refer to "Thread to share -- Mapping and visualizing data collabs  #13"
    • further ideas beyond preliminary questions

@patwater
Contributor Author

patwater commented Dec 31, 2018

@wz1405 what was the push access issue? LMK if I can be helpful there

Also curious: what were the other visualization ideas discussed?

@wz1405
Collaborator

wz1405 commented Jan 1, 2019 via email

@wz1405
Collaborator

wz1405 commented Jan 7, 2019

Date: Jan 4th, 2019
No meeting was held.

@patwater
Contributor Author

patwater commented Jan 7, 2019

@wz1405 roger. Any updates or need anything from me?

@jianweili0

jianweili0 commented Jan 11, 2019

Date: January 11th, 2019
Time: 13:00 - 13:30 EST
Attendance: Jianwei Li, Wenjie Zheng, and Patrick Atwater

To do list:

  1. Continue to get the unique contributors for each repo
  2. Add the commit data (using BeautifulSoup or PyGithub)
  3. Sum up the Watch, Star, and Fork data (using BeautifulSoup)
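Once the raw rows have been collected, the first and third to-do items reduce to simple aggregation. The sketch below assumes the rows are already available locally; the input shapes (`CONTRIBUTOR_ROWS`, `REPO_STATS`) and all values are hypothetical, and no scraping or API call is shown.

```python
from collections import defaultdict

# Hypothetical input: one (repo, contributor login) pair per commit or listing row.
CONTRIBUTOR_ROWS = [
    ("org-a/repo-1", "alice"),
    ("org-a/repo-1", "bob"),
    ("org-a/repo-1", "alice"),  # repeated login, counted once
    ("org-a/repo-2", "carol"),
]

# Hypothetical per-repo engagement counts.
REPO_STATS = [
    {"repo": "org-a/repo-1", "watchers": 3, "stars": 10, "forks": 2},
    {"repo": "org-a/repo-2", "watchers": 1, "stars": 4, "forks": 0},
]


def unique_contributors(rows):
    """Return {repo: number of distinct contributor logins}."""
    seen = defaultdict(set)
    for repo, login in rows:
        seen[repo].add(login)
    return {repo: len(logins) for repo, logins in seen.items()}


def sum_engagement(stats):
    """Sum the watch, star, and fork counts across all repos."""
    totals = {"watchers": 0, "stars": 0, "forks": 0}
    for row in stats:
        for key in totals:
            totals[key] += row[key]
    return totals


print(unique_contributors(CONTRIBUTOR_ROWS))
print(sum_engagement(REPO_STATS))
```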

Technical difficulties discussed:

  1. Still having issues scraping the unique contributors for each repo
  2. Try a longer sleep time (longer than 1 hour) and run the job over several days
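One way to express the "sleep longer and keep going" idea is a small retry wrapper. This is a sketch under assumptions, not the group's actual code: `RateLimited` is a hypothetical exception standing in for whatever error the scraper or client raises when the limit is hit, and `fetch` is any zero-argument function that performs one request.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised by a fetch function when the API limit is hit."""


def fetch_with_retry(fetch, max_attempts=3, wait_seconds=3600, sleep=time.sleep):
    """Call fetch(); on RateLimited, wait and try again up to max_attempts times.

    The sleep function is a parameter so the wait can be replaced in tests.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts:
                raise  # give up and surface the error after the last attempt
            sleep(wait_seconds)
```

Wrapping each per-repo request this way lets a long run pause for an hour and resume instead of failing partway through.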

Visualization questions refer to "#13"
further ideas beyond preliminary questions

@williamburgson
Collaborator

Repo data is uploaded as gov_repo_count.csv and gov_repo_spec.csv, and the code is uploaded as DataCollaboratives.ipynb. The notebook takes a very long time to run due to the GitHub API rate limit.
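For a rough sense of why the run is slow: the GitHub REST API allows 5,000 requests per hour for authenticated users and 60 per hour for unauthenticated ones. The arithmetic below is a back-of-the-envelope sketch. The 1,159 accounts figure comes from the Dec 21 notes, while the 50 requests per account is purely a hypothetical placeholder, not a measured value.

```python
import math


def rate_limit_windows(total_requests, requests_per_hour):
    """Number of hourly rate-limit windows needed to issue total_requests."""
    return math.ceil(total_requests / requests_per_hour)


ACCOUNTS = 1159            # from the Dec 21 meeting notes
REQUESTS_PER_ACCOUNT = 50  # hypothetical assumption for illustration

total = ACCOUNTS * REQUESTS_PER_ACCOUNT
print(total)                             # 57950
print(rate_limit_windows(total, 5000))   # 12 windows when authenticated
print(rate_limit_windows(total, 60))     # 966 windows when unauthenticated
```

Under these assumptions, an authenticated run spans roughly half a day, while an unauthenticated one would take weeks.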

@patwater
Contributor Author

Great! Time to get visualizing -- when do classes start btw?

@williamburgson
Collaborator

Spring semester just started today

@patwater
Contributor Author

Sounds good. Do you all have a desire to still work on this? (Fine if not though happy to help push this forward if you want to have something for your portfolio)

@jianweili0

jianweili0 commented Jan 29, 2019 via email

@patwater
Contributor Author

patwater commented Feb 7, 2019

Sounds good @jianweili0 any visualizations to share? Where does the dataframe you parsed live? Please feel free to push (particularly the data) to the branch you created :)

@williamburgson
Collaborator

Visualizations are still in progress, and the dataframes are written to CSV files: gov_repo_count.csv and gov_repo_spec.csv


4 participants