BlasphemyAngels/CrawlFickrData

A crawler for scraping Flickr image captions

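The Flickr datasets

   The Flickr datasets are image-caption datasets: each contains many images, and each image comes with five captions. There are two datasets: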

  • Flickr8k
  • Flickr30k

   A sample record looks like this:

   Image:

[image: exp]

   Captions:

  • A boy sand surfing down a hill
  • A man is attempting to surf down a hill made of sand on a sunny day.
  • A man is sliding down a huge sand dune on a sunny day.
  • A man is surfing down a hill of sand.
  • A young man in shorts and t-shirt is snowboarding under a bright blue sky.
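
   One way to hold such a record in memory is a mapping from an image identifier to its list of captions. The sketch below is only an illustration, not code from this repository: the image id example.jpg, the SAMPLE_RECORDS list and the group_captions helper are all hypothetical names, and the captions are the five shown above.

```python
from collections import defaultdict

# Hypothetical sample records as (image_id, caption) pairs.
# "example.jpg" is a made-up id; the captions are the ones listed above.
SAMPLE_RECORDS = [
    ("example.jpg", "A boy sand surfing down a hill"),
    ("example.jpg", "A man is attempting to surf down a hill made of sand on a sunny day."),
    ("example.jpg", "A man is sliding down a huge sand dune on a sunny day."),
    ("example.jpg", "A man is surfing down a hill of sand."),
    ("example.jpg", "A young man in shorts and t-shirt is snowboarding under a bright blue sky."),
]


def group_captions(records):
    """Group (image_id, caption) pairs into {image_id: [caption, ...]}."""
    grouped = defaultdict(list)
    for image_id, caption in records:
        # Strip stray whitespace that scraped text often carries.
        grouped[image_id].append(caption.strip())
    return dict(grouped)


if __name__ == "__main__":
    for image_id, captions in group_captions(SAMPLE_RECORDS).items():
        print(image_id, len(captions))
```

   Keeping the captions grouped per image makes it easy to check that every crawled image really has five captions before the data is used.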

   This project crawls the Flickr8k dataset.

Details

   Site hosting the data: Flickr8k

Requirements

  • Python
  • Scrapy

Running

   Change into the top-level CrawlFlickr directory, so that the current directory is the one containing scrapy.cfg, then run scrapy crawl ImageText.
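
   The same step can be scripted. The following is a minimal sketch, not part of this repository: it walks up from the current directory until it finds scrapy.cfg and then launches the crawl from there. The helper names find_scrapy_root and crawl_command are hypothetical; only the spider name ImageText and the scrapy crawl command come from the instructions above, and the script assumes the scrapy executable is on PATH.

```python
import subprocess
from pathlib import Path

SPIDER_NAME = "ImageText"  # spider name taken from the run instructions


def find_scrapy_root(start):
    """Return the nearest directory at or above `start` that holds scrapy.cfg, or None."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / "scrapy.cfg").is_file():
            return candidate
    return None


def crawl_command(spider=SPIDER_NAME):
    """Build the argument list equivalent to `scrapy crawl <spider>`."""
    return ["scrapy", "crawl", spider]


if __name__ == "__main__":
    root = find_scrapy_root(Path.cwd())
    if root is None:
        raise SystemExit("scrapy.cfg not found; change into the CrawlFlickr directory first")
    # Run the crawl from the project root, as the instructions require.
    subprocess.run(crawl_command(), cwd=root, check=True)
```

   Running the command from the directory that contains scrapy.cfg matters because Scrapy uses that file to locate the project settings and its spiders.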

How it works

   For background on Scrapy, see the blog post "scrapy初涉" (a first look at Scrapy).
