dont know state! uploading finished
Arunmozhi committed Jul 1, 2011
1 parent 825b07c commit 670196b
Showing 3 changed files with 61 additions and 10 deletions.
8 changes: 8 additions & 0 deletions app.yaml
@@ -3,6 +3,14 @@ version: 1
 runtime: python
 api_version: 1
 
 handlers:
+- url: /admin/remote_api
+  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
+  login: admin
+
 - url: .*
   script: main.py
+
+builtins:
+- datastore_admin: on
43 changes: 43 additions & 0 deletions bulkloader.yaml
@@ -0,0 +1,43 @@
+# Autogenerated bulkloader.yaml file.
+# You must edit this file before using it. TODO: Remove this line when done.
+# At a minimum address the items marked with TODO:
+#  * Fill in connector and connector_options
+#  * Review the property_map.
+#    - Ensure the 'external_name' matches the name of your CSV column,
+#      XML tag, etc.
+#    - Check that __key__ property is what you want. Its value will become
+#      the key name on import, and on export the value will be the Key
+#      object. If you would like automatic key generation on import and
+#      omitting the key on export, you can remove the entire __key__
+#      property from the property map.
+
+# If you have module(s) with your model classes, add them here. Also
+# change the kind properties to model_class.
+python_preamble:
+- import: base64
+- import: re
+- import: google.appengine.ext.bulkload.transform
+- import: google.appengine.ext.bulkload.bulkloader_wizard
+- import: google.appengine.ext.db
+- import: google.appengine.api.datastore
+- import: google.appengine.api.users
+
+transformers:
+- kind: Permission
+  connector: csv
+
+  property_map:
+    - property: __key__
+      external_name: key
+      export_transform: transform.key_id_or_name_as_string
+
+    - property: id
+      external_name: id
+      export_transform: transform.key_id_or_name_as_string
+
+    - property: link
+      external_name: link
+
+    - property: text
+      external_name: text
+      import_transform: db.Text
20 changes: 10 additions & 10 deletions main.py
@@ -25,11 +25,12 @@
 
 from google.appengine.ext import db
 
-
+class Test(db.Model):
+    lis = db.TextProperty()
 
 class Hook(db.Model):
     text = db.TextProperty()
-    page = db.LinkProperty()
+    page = db.StringProperty()
     category = db.StringProperty() # Remove Me later
     #projects = db.ListProperty(db.key, default=None)
 
@@ -42,21 +43,20 @@ def get(self):
         f = urlfetch.fetch("http://en.wikipedia.org/wiki/Wikipedia:Recent_additions")
         soup = BeautifulSoup.BeautifulSoup(f.content)
         lis = soup.findAll("li", attrs={"style":"-moz-float-edge: content-box"})
+        tet = Test()
+        tet.lis = lis.__str__()
+        tet.put()
         for li in lis:
+            dbhook = Hook()
             try:
                 link = li.b.a["href"]
             except TypeError:
                 link = li.find("a")["href"]
-            link = link.replace("/wiki/","")
-            hook = { "text" : li.text, "link" : link }
-            #store it in DB
-            dbhook = Hook()
-            dbhook.text = hook["text"]
-            dbhook.page = "http://en.wikipedia.org/wiki/"+hook["link"]
+            dbhook.text = str(li)
+            dbhook.page = link.replace("/wiki/","")
             dbhook.category = "June"
             dbhook.put()
-            self.response.out.write(unicode(hook["link"]))
-            self.response.out.write("<br />")
+            self.response.out.write("Done!")
 
 
 def main():
