poiu1235/weibo-catch

weibo-catch

To run this code on your own machine, you need a computer running Ubuntu with a MySQL database server installed.

You need to create the database first by running the database creation code in MySQL.

You must create two projects, each containing a copy of this code: one named weibo-catch (used to catch Weibo profiles) and another named weibo-catch-user (used to catch follows). Then run the two programs together.

1. In the file 'WeiboCatch.py', change findweibo(cls,sweb) as follows:

//if your project is catch-follow, delete the "$$" before the t1 line and the append(t1) line

//if your project is catch-profile, delete the "$$" before the t2 line and the append(t2) line

#------------------------------------------------------------------------------------------

$$t1 = threading.Thread(target=obj.catchfollow,args=("/"+userid+"/follow",))

$$threads.append(t1)

$$t2 = threading.Thread(target=obj.catchprofile,args=("/"+userid+"/profile",))

$$threads.append(t2)

#---------------------------------------------------------------------------------------------
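As a rough sketch of the same choice without editing "$$" markers by hand, the thread could be selected from a mode value. Here `DemoCatcher`, `build_threads`, `run_demo`, and the `mode` argument are hypothetical names used only for illustration; only `catchfollow`, `catchprofile`, and the path format come from the snippet above.

```python
import threading


class DemoCatcher:
    """Hypothetical stand-in for the crawler object used in WeiboCatch.py."""

    def __init__(self):
        self.visited = []

    def catchfollow(self, path):
        # The real method would crawl the follow list; here we only record the call.
        self.visited.append(("follow", path))

    def catchprofile(self, path):
        # The real method would crawl the profile; here we only record the call.
        self.visited.append(("profile", path))


def build_threads(obj, userid, mode):
    """Return the worker threads for one user; mode is 'follow' or 'profile'."""
    if mode == "follow":
        target, suffix = obj.catchfollow, "/follow"
    elif mode == "profile":
        target, suffix = obj.catchprofile, "/profile"
    else:
        raise ValueError("mode must be 'follow' or 'profile'")
    return [threading.Thread(target=target, args=("/" + userid + suffix,))]


def run_demo(userid="12345", mode="follow"):
    """Start the selected thread, wait for it, and return the recorded calls."""
    obj = DemoCatcher()
    threads = build_threads(obj, userid, mode)
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return obj.visited
```

With this shape, the weibo-catch-user copy would pass `mode="follow"` and the weibo-catch copy would pass `mode="profile"`, so both copies could share one source file.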

2. In 'ConnSQL.py', change def getuserid(self) and def setcompleteid(self,userid) as follows:

#----------------------------------------------------------------------------------------------

//if your project is catch-profile, use the following SQL for the select query (getuserid)

cursor.execute("SELECT * FROM weibocatch.w_user where (recon=0 or (recon=1 and color=0)) and flag=1 order by inserttime limit 1")

//if your project is catch-follow, use the following SQL for the select query (getuserid)

cursor.execute("SELECT * FROM weibocatch.w_user where (recon=0 or (recon=1 and color=0)) and flag=0 order by inserttime limit 1")

//if your project is catch-profile, use the following SQL for the update query (setcompleteid)

cursor.execute("UPDATE weibocatch.w_user SET flag = 2 WHERE wid = %s",(userid,))

//if your project is catch-follow, use the following SQL for the update query (setcompleteid)

cursor.execute("UPDATE weibocatch.w_user SET flag = 1 WHERE wid = %s",(userid,))

#-----------------------------------------------------------------------------------------------
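The four statements above differ only in the flag value, so they can be sketched as two templates plus a small lookup table. The names `SELECT_TEMPLATE`, `UPDATE_TEMPLATE`, `FLAGS`, and `queries_for` are hypothetical, and the reading of the `flag` column (0 = waiting for catch-follow, 1 = waiting for catch-profile, 2 = finished) is an assumption inferred from the queries, not something stated elsewhere in this README.

```python
# Query text copied from the statements above, with the flag value left open.
SELECT_TEMPLATE = (
    "SELECT * FROM weibocatch.w_user "
    "where (recon=0 or (recon=1 and color=0)) and flag={flag} "
    "order by inserttime limit 1"
)
UPDATE_TEMPLATE = "UPDATE weibocatch.w_user SET flag = {flag} WHERE wid = %s"

# Assumed meaning: (flag value to pick up, flag value to write when finished).
FLAGS = {
    "follow": (0, 1),   # catch-follow takes flag=0 rows and marks them 1
    "profile": (1, 2),  # catch-profile takes flag=1 rows and marks them 2
}


def queries_for(mode):
    """Return (select_sql, update_sql) for 'follow' or 'profile'."""
    pending, done = FLAGS[mode]
    return SELECT_TEMPLATE.format(flag=pending), UPDATE_TEMPLATE.format(flag=done)
```

Only the fixed integers from `FLAGS` are formatted into the SQL text; the user id still goes through the `%s` placeholder, as in `cursor.execute(update_sql, (userid,))`.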

Why not use multiprocessing?

Because at the beginning I gave many variables the same name, and at that time I overlooked the cases where global variables should be used. If I wanted to switch to multiprocessing now, I would have an enormous amount of work refactoring all the code.

Don't worry, I will do this tough work in the future. For now I must apologize: I can't do it yet, and I can only ensure that the program runs correctly.

You need to create a file named 'localconfig' to record the account information. Its format is as follows:

#------------------------------------

[weibo]

weibouser=xxxxxxxx

weibopwd=xxxxxxxx

[db]

dbsite=xxxxxxxx

dbuser=xxxxxxxx

dbpwd=xxxxxxxx

[email]

emailuser=xxxxxxxx

emailpwd=xxxxxxxx

emailhost=smtp.163.com #xxxxxxxx

emailrec=xxxxxxxx

#-------------------------------------

Project description: weibo-catch is a spider for Weibo (Sina). I want to catch two kinds of data from Weibo: one is follows, meaning the people, companies, or fields that a user is interested in; the other is what the user says, comments on, reposts, and shares. So I keep two threads to do the two kinds of jobs.
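For reference, the 'localconfig' format shown earlier is INI-style, so it can be read with Python's standard `configparser`. This is a minimal sketch under that assumption; `parse_localconfig` and `SAMPLE` are hypothetical names, and the project's own loading code may work differently.

```python
import configparser

# Sample text in the same layout as 'localconfig' (placeholder values only).
SAMPLE = """\
#------------------------------------
[weibo]
weibouser=xxxxxxxx
weibopwd=xxxxxxxx
[db]
dbsite=xxxxxxxx
dbuser=xxxxxxxx
dbpwd=xxxxxxxx
[email]
emailuser=xxxxxxxx
emailpwd=xxxxxxxx
emailhost=smtp.163.com #xxxxxxxx
emailrec=xxxxxxxx
#-------------------------------------
"""


def parse_localconfig(text):
    """Parse localconfig text into {section: {key: value}}."""
    parser = configparser.ConfigParser(
        interpolation=None,              # keep '%' in passwords literal
        inline_comment_prefixes=("#",),  # strip the trailing ' #xxxxxxxx' note
    )
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    config = parse_localconfig(SAMPLE)
    print(config["email"]["emailhost"])
```

Lines starting with `#` are treated as comments by default, so the `#----` delimiter lines are ignored. Note that with inline comments enabled, a value containing a space followed by `#` would be cut at that point.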

Because an enormous amount of information is created on Weibo every day, I always keep the program running. It loops over your follows list and catches their information.

All my data is stored in a MySQL database. I will provide the table structure and the creation code.

In the future, I will also store the data in two NoSQL databases, Redis and MongoDB, in order to compare their capacity for storing big data.

Finally, I will bring in HBase and Hadoop to run counts and statistics on the things I am interested in.

Once all of this is done, I will open a website to serve this large body of information for free.

My name is Luis, and I come from China, in Asia. My e-mail is yuyi304738837@163.com or 304738837@qq.com. I am a master's candidate at HIT. You are welcome to contact me to discuss.

About

weibo(sina) catcher
