This repository has been archived by the owner on Jan 15, 2018. It is now read-only.

Bring some life back into this project.
1. Updated the parsing code to work with latest codeplex.
2. Switched to use the github3 python library https://github.com/sigmavirus24/github3.py
mmanela committed Sep 15, 2014
1 parent 3136eff commit 0e48da3
Showing 2 changed files with 79 additions and 38 deletions.
24 changes: 14 additions & 10 deletions README.md
@@ -2,23 +2,27 @@

 ## Instructions

-1 - Download the [github2 Python library](http://packages.python.org/github2/).
+1 - Install [github3 Python library](https://github.com/sigmavirus24/github3.py) using either

+easy_install github3.py

-2 - Install the [github2 Python library](http://packages.python.org/github2/install.html).
+or

-3 - Download the codepleximport.py script
+pip install github3.py

-4 - Open it in a text editor and modify the values at the top
+2 - Download the codepleximport.py script

+3 - Open it in a text editor and modify the values at the top

-* CODEPLEX\_PROJECT - the subdomain of the project on codeplex _(yourproject.codeplex.com)_
-* GITHUB\_PROJECT - the name of the repository on github _(Codeplex-Issues-Importer)_
-* GITHUB\_USERNAME - your github username _(mendhak)_
-* GITHUB\_ISSUELABEL - a tag that will be applied to all issues imported _(CodePlex)_
-* GITHUB\_APITOKEN - your API Token which you can get from your [account admin page](https://github.com/account/admin)

+CODEPLEX\_PROJECT - the subdomain of the project on codeplex _(yourproject.codeplex.com)_
+GITHUB\_PROJECT - the name of the repository on github _(Codeplex-Issues-Importer)_
+GITHUB\_USERNAME - your github username _(mendhak)_
+GITHUB\_ISSUELABEL - a tag that will be applied to all issues imported _(CodePlex)_
+GITHUB\_PASSWORD - your github password

-5 - Run the script.

+4 - Run the script.


 ## Notes
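
The README steps above have you type your GitHub password into the script file. A minimal sketch of one alternative is to read the same settings from environment variables and fail early when one is missing. It assumes Python 3, whereas the script itself is Python 2. The environment variable names simply reuse the README's setting names, and `load_settings` is a hypothetical helper; `codepleximport.py` does not read any of these from the environment.

```python
import os

# Hypothetical: reuse the README's setting names as environment variable names.
REQUIRED_SETTINGS = (
    "CODEPLEX_PROJECT",
    "GITHUB_PROJECT",
    "GITHUB_USERNAME",
    "GITHUB_ISSUELABEL",
    "GITHUB_PASSWORD",
)


def load_settings(environ=None):
    """Return the importer settings as a dict, read from environment variables."""
    environ = os.environ if environ is None else environ
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_SETTINGS if not environ.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_SETTINGS}
```

Calling `load_settings()` with no argument reads from `os.environ`; passing a plain dict makes the function easy to try out without touching the real environment.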
93 changes: 65 additions & 28 deletions codepleximport.py
@@ -1,27 +1,39 @@
 #! /usr/bin/python

 import sys

-#Download the github2 package from http://packages.python.org/github2/
+# Set to true to force script to run in UTF8
+forceUTF8 = False


+if forceUTF8:
+    # Set default encoding to 'UTF-8' instead of 'ascii'
+    # http://stackoverflow.com/questions/11741574/how-to-set-the-default-encoding-to-utf-8-in-python
+    # Bad things might happen though
+    reload(sys)
+    sys.setdefaultencoding("UTF8")


+# Used github3.py - https://github.com/sigmavirus24/github3.py
+# install using easy_install github3.py
+# or pip install github3.py

 #User specific values

 CODEPLEX_PROJECT = "gpslogger.codeplex.com"
 GITHUB_PROJECT = "Codeplex-Issues-Importer"
 GITHUB_USERNAME = "mendhak"
+GITHUB_PASSWORD = ""
 GITHUB_ISSUELABEL = "CodePlex"
-GITHUB_APITOKEN = ""

-if GITHUB_APITOKEN == "":
-    raise Exception("You haven't supplied an API Token in this file")


 import urllib2
 import HTMLParser
 import re

 try:
-    from github2.client import Github
+    from github3 import login
 except ImportError, e:
-    print "You haven't installed the github2 package"
+    print "You haven't installed the github3 package"
     exit()


@@ -57,7 +69,7 @@ def AppendDescription(self, d):

     def SetSubmittedBy(self, s):
         self.submittedby = s
-        self.description = "<b>" + s + "[CodePlex]</b><br />" + self.description
+        self.description = "<b>" + s + "[CodePlex]</b> <br />" + self.description

     def AddComment(self, c):
         self.comments.insert(0,c)
@@ -95,6 +107,7 @@ class WorkItemParser(HTMLParser.HTMLParser):
     descriptionFound = False

     commentByFound = False
+    commentAreaFound = False
     commentFound = False
     comment = ""
@@ -113,18 +126,26 @@ def __init__(self, url):

     def handle_starttag(self, tag, attrs):

-        if tag == "span" and len(attrs) > 0:
-            spanId = getTupleValue(attrs, "id")
-            if spanId == "TitleLabel":
+        if tag == "h1" and len(attrs) > 0:
+            h1Id = getTupleValue(attrs, "id")
+            if h1Id == "workItemTitle":
                 self.titleFound = True

+        if tag == "p" and len(attrs) > 0:
+            pId = getTupleValue(attrs, "id")
+            if pId != None and "VotedLabel" in pId:
+                self.itemStatusFound = True

         if tag == "div" and len(attrs) > 0:
             divId = getTupleValue(attrs, "id")
-            if divId == "DescriptionPanel":
+            divClass = getTupleValue(attrs, "class")
+            #print " DivId: %s, DivClass:%s" %(divId, divClass)
+            if divId == "descriptionContent":
                 self.descriptionFound = True
-            elif divId != None and "MessageLabel" in divId:
+            elif divId != None and "CommentContainer" in divId:
+                self.commentAreaFound = True
+            elif self.commentAreaFound and divClass!= None and "markDownOutput" in divClass:
                 self.commentFound = True
-            elif divId != None and "VotedLabel" in divId:
-                self.itemStatusFound = True

         if tag == "a" and len(attrs) > 0:
             aId = getTupleValue(attrs, "id")
@@ -137,28 +158,38 @@ def handle_starttag(self, tag, attrs):
     def handle_data(self, data):
         if self.titleFound:
             self.currentWorkItem.AppendHeading(data)
+            print "Title: %s" % (data)
         if self.descriptionFound:
             self.currentWorkItem.AppendDescription(data)
+            print "Description: %s" % (data)
         if self.commentByFound:
-            self.comment = self.comment + "<b>" + data + "[CodePlex]</b>"
+            self.comment = "<b>" + data + "[CodePlex]</b> <br />" + self.comment
+            print "CommentBy: %s" % (data)
         if self.commentFound:
             self.comment = self.comment + data
+            print "Comment: %s" % (data)
         if self.submittedByFound:
             self.currentWorkItem.SetSubmittedBy(data)
+            print "SubmittedBy: %s" % (data)
         if self.itemStatusFound:
-            if data == "closed":
+            if data == "Closed":
                 self.currentWorkItem.SetIsClosed(True)

+                print "CLOSED: %s" % (True)


     def handle_endtag(self, tag):
-        if self.titleFound and tag == "span":
+        if self.itemStatusFound and tag == "p":
+            self.itemStatusFound = False
+        if self.titleFound and tag == "h1":
             self.titleFound = False
         if self.descriptionFound and tag == "div":
             self.descriptionFound = False
         if self.commentByFound and tag == "a":
             self.commentByFound = False
         if self.commentFound and tag == "div":
             self.commentFound = False
+            self.commentAreaFound = False
             self.currentWorkItem.AddComment(self.comment)
             self.comment = ''
         if self.submittedByFound:
@@ -167,8 +198,8 @@ def handle_endtag(self, tag):
     def handle_entityref(self, name):
         if self.descriptionFound:
             self.currentWorkItem.AppendDescription(name)
-        if self.commentFound:
-            self.comment = self.comment + data
+        #if self.commentFound:
+            #self.comment = self.comment + data
         if self.titleFound:
             self.currentWorkItem.AppendHeading(name)
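
The parser changes in this file all follow one flag-based pattern: `handle_starttag` sets a flag when it sees a known `id`, `handle_data` collects text while the flag is set, and `handle_endtag` clears it. A minimal sketch of that pattern follows. It assumes Python 3's `html.parser`, whereas the script targets Python 2's `HTMLParser`. The HTML snippet is made up and only reuses the `workItemTitle` and `descriptionContent` ids from the diff; `TitleAndDescriptionParser` is a hypothetical name.

```python
from html.parser import HTMLParser


class TitleAndDescriptionParser(HTMLParser):
    """Collect the text of <h1 id="workItemTitle"> and <div id="descriptionContent">."""

    def __init__(self):
        super().__init__()
        self.title_found = False
        self.description_found = False
        self.title = ""
        self.description = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a dict makes lookups simpler.
        element_id = dict(attrs).get("id")
        if tag == "h1" and element_id == "workItemTitle":
            self.title_found = True
        elif tag == "div" and element_id == "descriptionContent":
            self.description_found = True

    def handle_data(self, data):
        # Text is appended only while the matching flag is set.
        if self.title_found:
            self.title += data
        if self.description_found:
            self.description += data

    def handle_endtag(self, tag):
        # Like the script, the first matching end tag clears the flag,
        # so a nested <div> would end the description early.
        if self.title_found and tag == "h1":
            self.title_found = False
        if self.description_found and tag == "div":
            self.description_found = False


# Made-up markup for illustration only.
SAMPLE_HTML = (
    '<h1 id="workItemTitle">Crash on start</h1>'
    '<div id="descriptionContent">App exits immediately.</div>'
)

parser = TitleAndDescriptionParser()
parser.feed(SAMPLE_HTML)
parser.close()
print(parser.title)        # Crash on start
print(parser.description)  # App exits immediately.
```

Because the flags depend on specific element ids, a markup change on the scraped site breaks the parser silently, which is the kind of breakage this commit repairs.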

@@ -208,36 +239,42 @@ def handle_startendtag(self, tag, attrs):
     print "Parsing page ", pageNumber, "(", issuePageUrl, ")"

     ilp = IssuesListParser(issuePageUrl)

     if len(ilp.itemLinks) == totalLinks:
         continuePaging = False
         print "Reached end of issue pages"
     else:
         totalLinks = len(ilp.itemLinks)
         continuePaging = True

-print len(ilp.itemLinks), " work item links found"
+#print len(ilp.itemLinks), " work item links found"

 parsedWorkItems = []


 #Loop through, process each work item link
 for itemUrl in ilp.itemLinks:
+    print "\n\nParsing %s" % (itemUrl)
     wiParser = WorkItemParser(itemUrl)
     print wiParser.currentWorkItem.heading
     parsedWorkItems.append(wiParser.currentWorkItem)

 print len(parsedWorkItems), " work items parsed from CodePlex"

 #initialize github
-gh = Github(username=GITHUB_USERNAME, api_token=GITHUB_APITOKEN,requests_per_second=0.5)
+gh = login(GITHUB_USERNAME, password=GITHUB_PASSWORD)

 for wi in parsedWorkItems:
-    newIssue = gh.issues.open(GITHUB_USERNAME + "/" + GITHUB_PROJECT, title=wi.heading, body=wi.description)
-    gh.issues.add_label(GITHUB_USERNAME + "/" + GITHUB_PROJECT, newIssue.number, GITHUB_ISSUELABEL)
+    print "gh.create_issue(%s,%s,%s,%s,labels=[%s])" % (GITHUB_USERNAME,GITHUB_PROJECT,wi.heading,wi.description,GITHUB_ISSUELABEL)
+    newIssue = gh.create_issue(GITHUB_USERNAME,GITHUB_PROJECT,wi.heading,wi.description,labels=[GITHUB_ISSUELABEL])
+    if not newIssue:
+        print "Unable to create issue"
+        continue

     for c in wi.comments:
-        gh.issues.comment(GITHUB_USERNAME + "/" + GITHUB_PROJECT, newIssue.number, c)
+        newIssue.create_comment(c)
     if wi.isClosed:
-        gh.issues.close(GITHUB_USERNAME + "/" + GITHUB_PROJECT, newIssue.number)
+        newIssue.close()
     print "Created Github issue", newIssue.number, "for", "[" + wi.heading + "]"

 print "End of script"
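
The final loop of the script creates an issue, replays its comments, and closes it if the CodePlex item was closed. A minimal sketch of that per-item step as a standalone function is shown here. It assumes Python 3 and a client object whose `create_issue` returns something with `create_comment` and `close` methods, as the github3.py calls in this commit do. `WorkItem` and `publish_work_item` are hypothetical names, not part of the script.

```python
from collections import namedtuple

# Hypothetical stand-in for the script's parsed work item objects.
WorkItem = namedtuple("WorkItem", "heading description comments is_closed")


def publish_work_item(gh, owner, repo, item, label):
    """Create one issue for a work item, then replay its comments and state."""
    issue = gh.create_issue(owner, repo, item.heading, item.description,
                            labels=[label])
    if not issue:
        # Mirror the script: skip items for which no issue came back.
        return None
    for comment in item.comments:
        issue.create_comment(comment)
    if item.is_closed:
        issue.close()
    return issue
```

With github3.py installed, `gh` could come from `github3.login(token=...)` rather than a password. Keeping the client as a parameter also lets the function be exercised with a fake object, without touching the network.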

3 comments on commit 0e48da3

@fire-eggs
Contributor


Very useful script, especially with Microsoft shutting down Codeplex. I'm using your script to migrate a project to Github.

Two problems:

  1. Apparently github has added "abuse limits". I could only import 18 issues before github hit me with a 403 error. Github "best practices" suggests waiting at least a second between submissions.
  2. Given problem 1), it is difficult to start "where I left off". I don't know enough about the github API to feel confident that Github won't create duplicate issues if I start over from the beginning.

Minor nit: your web page still talks about the "old" version of the script (e.g. 'github2', etc).

Minor nit: identifying the script as Python 2.x (not 3.x) would have been helpful.

All this aside, thank you very much for this script!
Kevin
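
Both problems raised in this comment can be worked around in the calling loop. A minimal sketch follows, assuming Python 3 and a client object that exposes `create_issue(owner, repository, title, body, labels=...)` as used in the commit. It pauses between submissions and records each imported work item in a local JSON file, so a rerun skips what was already created. The file, the `(url, title, body)` tuple shape, and the names `load_done`, `save_done` and `import_items` are all hypothetical; the one-second default follows the guideline quoted above.

```python
import json
import os
import time


def load_done(path):
    """Return the set of work item URLs recorded by earlier runs."""
    if not os.path.exists(path):
        return set()
    with open(path) as handle:
        return set(json.load(handle))


def save_done(path, done):
    """Persist the set of imported work item URLs."""
    with open(path, "w") as handle:
        json.dump(sorted(done), handle)


def import_items(gh, owner, repo, work_items, label, state_path,
                 delay_seconds=1.0, sleep=time.sleep):
    """Create one issue per (url, title, body) tuple, skipping recorded ones."""
    done = load_done(state_path)
    created = []
    for url, title, body in work_items:
        if url in done:
            continue  # already imported by an earlier run
        issue = gh.create_issue(owner, repo, title, body, labels=[label])
        if not issue:
            break  # stop here; a rerun resumes from this item
        done.add(url)
        save_done(state_path, done)  # record progress after every issue
        created.append(issue)
        sleep(delay_seconds)  # pause between submissions
    return created
```

Because progress is written after every successful call, an error raised in the middle of a run leaves the state file intact, and the next run continues with the first unrecorded item instead of creating duplicates.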

@mendhak
Owner

mendhak commented on 0e48da3 Jun 1, 2017


Hey @fire-eggs - do you happen to have the changes you made to get things running, even minor ones? Maybe you could open a pull request; then I could merge it in, and other CodePlexers could benefit from it if they need it.

I haven't touched or used this in 3 years and have forgotten a lot about it, so any modifications, even minor ones, would help others, I'm sure.

I'm sorry to see Codeplex go, I lurk on some of the app-projects there. Did Codeplex email everyone to nudge them off or did they only do a banner/blog announcement?

@fire-eggs
Contributor


Sure, I'll try to do a pull request soon - forgive me if I don't get it right!

Embarrassed to say it took a few tries before Github would stop complaining ...
