Merge branch 'master' of github.com:bibanon/bibanon

Conflicts:
	Bibanon/tools/BA-4chan-Archiver/src/4chan-html2json
	Bibanon/tools/BA-4chan-Archiver/the-chandler
Lawrence committed Mar 15, 2013
2 parents 257862f + 3617fe1 commit 440f75da4d42ab2cdd82e166eb98741e91254bf7
Showing with 139 additions and 158 deletions.
  1. 0 Bibanon/tools/{ → 4chan-downloader}/4chan-Thread-Grabber.md
  2. 0 Bibanon/tools/{ → 4chan-downloader}/4chanig.py
  3. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/Example/4chan-g-28885080.json
  4. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/README.md.txt
  5. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/example/4chan-g-29127693.json
  6. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/example/archive.py
  7. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353111250000.png
  8. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353112221337.jpg
  9. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353112262990.png
  10. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353112994738.png
  11. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353113128942.jpg
  12. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353114240127.png
  13. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353114286128.png
  14. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353114334243.png
  15. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353115645733.png
  16. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353115972892.png
  17. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116006976.png
  18. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116208234.png
  19. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116287210.png
  20. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116378158.jpg
  21. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116420009.png
  22. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116587212.jpg
  23. BIN ...n/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/images/1353116826952.png
  24. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/example/nexus4/g-29127693/thread.js
  25. +16 −0 Bibanon/tools/4chan-downloader/grab-thread.sh
  26. +1 −0 Bibanon/tools/4chan-downloader/src/4chandownloader
  27. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/src/CSS/burichan.css
  28. 0 Bibanon/tools/{BA-4chan-Archiver → 4chan-downloader}/src/CSS/futaba.css
  29. +0 −1 Bibanon/tools/4chan-image-dl
  30. +0 −63 Bibanon/tools/BA-4chan-Archiver/4chan-Thread-Grabber.md
  31. +0 −84 Bibanon/tools/BA-4chan-Archiver/4chanig.py
  32. +16 −0 Bibanon/tools/BA-4chan-Archiver/README.md
  33. +5 −5 Bibanon/tools/BA-4chan-Archiver/grab-thread.sh
  34. +100 −0 Bibanon/tools/BA-4chan-Archiver/img-downloader.py
  35. +0 −1 Bibanon/tools/BA-4chan-Archiver/src/4chan-html2json
  36. +0 −1 Bibanon/tools/BA-4chan-Archiver/src/4chan2json
  37. +0 −1 Bibanon/tools/BA-4chan-Archiver/src/pure
  38. +0 −1 Bibanon/tools/BA-4chan-Archiver/src/py-4chan
  39. +1 −1 Books/Guides/Newfags-Guide/Botnets.md
@@ -0,0 +1,16 @@
+#!/bin/bash
+
+tnumber=`echo $1 | cut -c32-`
+
+mkdir $tnumber
+cd $tnumber
+
+while :
+ do
+ wget -e robots=off -E -nd -nc -np -r -k -H -D images.4chan.org,thumbs.4chan.org $1
+ cp $tnumber.html index.html
+ sleep 10
+ done
+#backup
+#echo "thread number?"
+#read tnumber#mkdir
Submodule 4chandownloader added at f68b92
Submodule 4chan-image-dl deleted from c1501f
@@ -1,63 +0,0 @@
-With the introduction of 4chan's JSON API, archiving threads has never been easier!
-
-## How to archive threads by hand
-
-If you're on Linux, see "Getting Prettified JSON". This will give you the raw thread in JSON format, which can then be transformed into other formats.
-
-Hopefully, we will be able to convert previously saved html into the new JSON format.
-
-* [Adevore's 4chan Archiver](https://github.com/adevore/4chan-archiver) offered the best solution, using BeautifulSoup to save threads and images. Just install `python3-bs4` on Debian/Ubuntu. With the advent of the 4chan API, however, it is now a roundabout solution.
-
-## Automatic archiver
-
-If Fuuka is too big for you, here is a smaller chan-archiver made in PHP:
-
-https://github.com/emoose/chan-archiver/
-
-Here is a python wrapper for the 4chan API:
-
-https://github.com/e000/py-4chan
-
-### Saving images
-
-The JSON also gives an MD5 hash for every image, in case the original was not saved and needs to be retrieved.
-
-## Format
-
-The format basically corresponds to the conventional 4chan link, with the addition of `.json` at the end and the `api` subdomain in the beginning.
-
-In the examples below, substitute the `<board>` tag with the board acronym (e.g. `lit` for Literature) and the `<thread-id>` tag with the ID of the OP post, which can be found in the original HTML link.
-
-Conventional html link:
-
- http://boards.4chan.org/<board>/res/<thread-id>
-
-JSON API link:
-
- http://api.4chan.org/<board>/res/<thread-id>.json
-
-## Getting Prettified JSON
-
-Using python:
-
- curl http://api.4chan.org/<board>/res/<thread-id>.json | python -mjson.tool
-
-Using perl:
-
- curl http://api.4chan.org/<board>/res/<thread-id>.json | json_pp
-
-## Converting JSON to Markdown
-
-## Saving Images
-
-Here is a node.js script to gather full images from a certain thread until it is no longer reachable.
-
-https://github.com/ypocat/4chan
-
-The best python script for image grabbing. Also takes in times.
-
-https://github.com/crypt3lx2k/4chan-Image-Scraper
-
-Here is an equivalent python script that also watches, and is extensible to any chan:
-
-https://github.com/lunanoko/4chan-image-downloader
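The URL rule described in the removed 4chan-Thread-Grabber.md (swap the `boards` subdomain for `api` and append `.json`) can be sketched in Python as follows. This is a minimal illustration, not code from the repository: the function names are hypothetical, and the `api.4chan.org` host is taken from the notes as they stood at the time of this commit, so it may no longer resolve.

```python
# Hypothetical helpers illustrating the link format from the removed notes.
HTML_PREFIX = "http://boards.4chan.org/"
API_PREFIX = "http://api.4chan.org/"


def thread_html_url(board, thread_id):
    """Conventional HTML link for a thread."""
    return "%s%s/res/%s" % (HTML_PREFIX, board, thread_id)


def thread_json_url(board, thread_id):
    """JSON API link for the same thread."""
    return "%s%s/res/%s.json" % (API_PREFIX, board, thread_id)


def html_to_json_url(html_url):
    """Convert an HTML thread link into the matching JSON API link."""
    if not html_url.startswith(HTML_PREFIX):
        raise ValueError("not a boards.4chan.org link: %r" % html_url)
    return API_PREFIX + html_url[len(HTML_PREFIX):] + ".json"


if __name__ == "__main__":
    print(html_to_json_url(thread_html_url("g", 29127693)))
```

The thread ID used here is the one from the example directory in this commit; any board acronym and OP post ID can be substituted.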
@@ -1,84 +0,0 @@
-#!/usr/local/bin/python
-
-# 4chan image grabber.
-# Usage: 4chanig.py [board] [thread number]
-# Written by clizana
-# cristian@lizana.in
-
-# Some functions, I love functions.
-
-def in_array(array, needle):
-    for item in array:
-        if(item==needle.lower()):
-            return True
-    return False
-
-def number_files(array):
-    number = 0
-    for element in array['posts']:
-        if element.has_key('filename'):
-            number += 1
-    return number
-
-def download_file(url, file_name, curr_file, num_files):
-    u = urllib2.urlopen(url)
-    f = open(file_name, 'wb')
-    meta = u.info()
-    file_size = int(meta.getheaders("Content-Length")[0])
-    print "Downloading file %s of %s (File Size: %s Bytes)" % (curr_file, num_files, file_size)
-
-    file_size_dl = 0
-    block_sz = 8192
-    while True:
-        buffer = u.read(block_sz)
-        if not buffer:
-            break
-
-        file_size_dl += len(buffer)
-        f.write(buffer)
-        status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
-        status = status + chr(8)*(len(status)+1)
-        print status,
-
-    f.close()
-
-# What we need.
-import urllib, urllib2, simplejson, sys, os
-
-# Valid Boards
-BOARD_LIST = ['a', 'b' , 'c' , 'd' , 'e' , 'f' , 'g' , 'gif' , 'h' , 'hr' , 'k' , 'm' , 'o' , 'p' , 'r' , 's' , 't' , 'u' , 'v' , 'vg' , 'w' , 'wg', 'i' , 'ic', 'r9k', 'cm' , 'hm' , 'y', '3' , 'adv' , 'an' , 'cgl' , 'ck' , 'co' , 'diy' , 'fa' , 'fit' , 'hc' , 'int' , 'jp' , 'lit' , 'mlp' , 'mu' , 'n' , 'po' , 'pol' , 'sci' , 'soc' , 'sp' , 'tg' , 'toy' , 'trv' , 'tv' , 'vp' , 'wsg' , 'x', 'q']
-
-if len(sys.argv)==3 and (sys.argv[1]!="" and sys.argv[2]!="" and sys.argv[2].isdigit()):
-    board = sys.argv[1]
-    thread = sys.argv[2]
-    IMG_PATH = "http://images.4chan.org/%s/src/" % (board)
-    JSON_URL = "http://api.4chan.org/%s/res/%s.json" % (board, thread)
-
-    #We check some stuff (maybe not too efficient)
-    try:
-        if not os.path.exists(thread):
-            os.mkdir(thread)
-    except:
-        print "Error: creating the output dir"
-        exit()
-
-    if in_array(BOARD_LIST, board)==False:
-        print "Error: The board doesn't exist"
-        exit()
-    try:
-        results = simplejson.load(urllib.urlopen(JSON_URL))
-        current_file = 1
-        total_files = number_files(results)
-        for result in results['posts']:
-            if result.has_key('filename'):
-                if os.path.isfile("%s/%s%s" % (thread, result['filename'], result['ext']))==False:
-                    download_file("%s%s%s" % (IMG_PATH, result['tim'], result['ext']), "%s/%s%s" % (thread, result['filename'], result['ext']), "%s" % (current_file), "%s" % (total_files))
-                    #Using the urllib method without progress.
-                    #urllib.urlretrieve ("%s%s%s" % (IMG_PATH, result['tim'], result['ext']), "%s/%s%s" % (thread, result['filename'], result['ext']))
-                current_file = current_file + 1
-    except:
-        print "The thread number is wrong or the thread doesn't exist anymore"
-else:
-    print "Usage 4chanthread.py [board] [thread number]"
-
-
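The removed 4chanig.py fetches each image by its server-side name (`tim` plus `ext`) but saves it under the poster's original `filename`. A minimal Python 3 sketch of that mapping, with no network access, might look like the following. The function name `image_jobs` and the sample thread data are hypothetical; only the `posts`, `filename`, `tim` and `ext` fields and the `images.4chan.org` URL pattern come from the script above.

```python
def image_jobs(thread, board):
    """Return (remote_url, local_name) pairs for every post that has a file.

    `thread` is the decoded thread JSON: a dict with a "posts" list.
    """
    jobs = []
    for post in thread["posts"]:
        # Posts without an attachment have no "filename" key.
        if "filename" not in post:
            continue
        remote_name = "%s%s" % (post["tim"], post["ext"])
        local_name = "%s%s" % (post["filename"], post["ext"])
        url = "http://images.4chan.org/%s/src/%s" % (board, remote_name)
        jobs.append((url, local_name))
    return jobs


if __name__ == "__main__":
    # Hypothetical sample data shaped like the API response.
    sample = {
        "posts": [
            {"no": 1, "filename": "nexus4", "ext": ".png", "tim": 1353111250000},
            {"no": 2},
        ]
    }
    for url, name in image_jobs(sample, "g"):
        print(url, "->", name)
```

Saving under the original filename means two posts with identically named attachments collide; img-downloader.py in this commit avoids that by saving under `tim` instead.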
@@ -0,0 +1,16 @@
+A quick-and-dirty Python-based image downloader that uses the 4chan API to produce better, faster, machine-parsable 4chan thread archives.
+
+Supports Mac OS X and Linux (make sure that Python 2.x is installed).
+
+Dependencies: Python 2.x, Bash
+
+* [The Chandler - PyQt4 GUI interface for downloading images.](https://github.com/Dhole/4chan-image-dl.git)
+* [img-downloader.py - Command line script.](https://github.com/socketubs/4chandownloader.git)
+* grab-thread.sh - Just repeats the command:
+ curl http://api.4chan.org/$boardid/res/$tnumber.json | python2 -mjson.tool > b-$tnumber.json
+
+## Improvements:
+
+* Windows version (by integrating into The Chandler?)
+* Integrate the thread JSON grabber into The Chandler
+* Create a JSON-to-HTML viewer for JSON posts, using Yotsuba CSS
@@ -1,14 +1,14 @@
#!/bin/bash
-$tnumbertnumber=`echo $1 | cut -c32-`
+boardid="$1"
+tnumber="$2"
-mkdir $tnumber
-cd $tnumber
+echo "$boardid $tnumber"
+echo "Starting thread text archival..."
while :
do
- wget -e robots=off -E -nd -nc -np -r -k -H -D images.4chan.org,thumbs.4chan.org $1
- cp $tnumber.html index.html
+ curl http://api.4chan.org/$boardid/res/$tnumber.json | python2 -mjson.tool > b-$tnumber.json
sleep 10
done
#backup
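The `curl ... | python2 -mjson.tool > b-$tnumber.json` step in grab-thread.sh can be approximated in Python as sketched below. This is an illustration under stated assumptions, not part of the commit: `prettify` and `output_name` are hypothetical names, `sort_keys=True` is used to mimic the sorted output of the Python 2 `json.tool`, and the file name is derived from the board instead of the hard-coded `b-` prefix used by the script.

```python
import json


def prettify(raw_text):
    """Re-serialise raw JSON text with 4-space indentation, like json.tool."""
    return json.dumps(json.loads(raw_text), indent=4, sort_keys=True)


def output_name(board, thread):
    """File name for the saved thread; uses the board instead of a fixed 'b-'."""
    return "%s-%s.json" % (board, thread)


if __name__ == "__main__":
    # Hypothetical one-post thread, standing in for the downloaded response.
    raw = '{"posts": [{"no": 29127693, "com": "example"}]}'
    print(output_name("g", "29127693"))
    print(prettify(raw))
```

Naming the file after the board keeps archives from different boards apart when the same script is pointed at threads outside /b/.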
@@ -0,0 +1,100 @@
+#!/usr/bin/env python
+# coding: utf-8
+
+#
+# Initial release Nov. 5, 2009
+# v6 release Jan. 20, 2009
+# http://cal.freeshell.org
+#
+# Refactor, update and Python package
+# by Socketubs (http://socketubs.net/)
+# 09-08-12
+#
+
+import os
+import time
+import json
+from docopt import docopt
+import requests
+
+doc = """4chandownloader.py, download 4chan thread images.
+
+Usage:
+ 4chandownloader.py <url> <path> [--delay=<int>] [--thumbs]
+ 4chandownloader.py -h | --help
+ 4chandownloader.py -v | --version
+
+Options:
+ --thumbs Download thumbnails
+ --delay=<int> Delay between thread checks [default: 20]
+ -h --help Show help
+ -v --version Show version
+"""
+
+def main(args):
+    thread = args.get('<url>').split('/')[5]
+    board = args.get('<url>').split('/')[3]
+    path = args.get('<path>')
+    thumbs = args.get('--thumbs', False)
+    delay = args.get('--delay')
+
+    #Start
+    while 1:
+        r = requests.get('https://api.4chan.org/%s/res/%s.json' % (board, thread))
+
+        if not os.path.exists(path):
+            os.makedirs(path)
+        if not os.path.exists(os.path.join(path, board)):
+            os.makedirs(os.path.join(path, board))
+        if not os.path.exists(os.path.join(path, board, thread)):
+            os.makedirs(os.path.join(path, board, thread))
+        if thumbs:
+            if not os.path.exists(os.path.join(path, board, thread, 'thumbs')):
+                os.makedirs(os.path.join(path, board, thread, 'thumbs'))
+
+        print(' :: Board: %s' % board)
+        print(' :: Thread: %s' % thread)
+
+        dst = os.path.join(path, board, thread)
+        dst_thumbs = os.path.join(path, board, thread, 'thumbs')
+        for post in r.json()['posts']:  # json is a method in requests >= 1.0
+            if post.get('filename', False):
+                if post.get('filedeleted', False):
+                    continue
+                file_name = '%s%s' % (post['tim'], post['ext'])
+                file_path = os.path.join(dst, file_name)
+                file_url = 'https://images.4chan.org/%s/src/%s' % (board, file_name)
+
+                if not os.path.exists(file_path):
+                    print('%s downloading...' % file_name)
+                    i = requests.get(file_url)
+                    if i.status_code == 404:
+                        print(' | Failed, try later (%s)' % file_url)
+                    else:
+                        open(file_path, 'wb').write(i.content)  # binary mode for image bytes
+                else:
+                    print('%s already downloaded' % file_name)
+
+                if thumbs:
+                    thumb_name = '%ss.jpg' % post['tim']
+                    thumb_path = os.path.join(dst_thumbs, thumb_name)
+                    thumb_url = 'https://thumbs.4chan.org/%s/thumb/%s' % (board, thumb_name)
+                    if not os.path.exists(thumb_path):
+                        print('%s (thumb) downloading...' % thumb_name)
+                        i = requests.get(thumb_url)
+                        if i.status_code == 404:
+                            print(' | Failed, try later (%s)' % thumb_url)
+                        else:
+                            open(thumb_path, 'wb').write(i.content)  # binary mode for image bytes
+                    else:
+                        print('%s (thumb) already downloaded' % thumb_name)
+
+        json.dump(r.json(), open(os.path.join(dst, '%s.json' % thread), 'w'))
+
+        #Wait to execute code again
+        print("Waiting %s seconds before retrying" % delay)
+        time.sleep(int(delay))
+
+if __name__ == '__main__':
+    args = docopt(doc, version=0.3)
+    main(args)
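img-downloader.py picks the board and thread out of the URL by fixed position (`split('/')[3]` and `split('/')[5]`), and the older grab-thread.sh uses a fixed character offset (`cut -c32-`). Both depend on the exact shape of the link. A somewhat more defensive sketch, assuming thread links of the form `http://boards.4chan.org/<board>/res/<thread-id>`, is shown below; `parse_thread_url` is a hypothetical helper, not part of either script.

```python
from urllib.parse import urlparse


def parse_thread_url(url):
    """Return (board, thread_id) from a .../<board>/res/<thread-id> link."""
    # Keep only non-empty path segments, so a trailing slash is harmless;
    # urlparse already separates any "#p123" fragment from the path.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) != 3 or parts[1] != "res" or not parts[2].isdigit():
        raise ValueError("unexpected thread URL: %r" % url)
    return parts[0], parts[2]


if __name__ == "__main__":
    print(parse_thread_url("http://boards.4chan.org/g/res/29127693"))
```

Validating the path up front turns a malformed link into an immediate error instead of an `IndexError` or a request for the wrong thread later in the loop.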
Submodule 4chan-html2json deleted from af2a36
Submodule 4chan2json deleted from af2a36
Submodule pure deleted from 5e0907
Submodule py-4chan deleted from cb3f9c
@@ -1,4 +1,3 @@
-
# How to Make a Botnet #
The information on this page is provided for educational purposes only. The actual practices this page describes are illegal in most countries, and will violate most hosts' terms of service. If you actually intend to do any of these things, you should probably set up your own server, retain a good lawyer, and buy some anal lube for when you get sent to prison. Anonymous, whoever the fuck is hosting this wiki, and the article authors are not responsible for your own stupidity.
@@ -29,6 +28,7 @@ About the anal lube part only and a good lawyer is pure bullshit because if you
# Uploading #
OK, now that you know what you need and what you should have, let's get to the real business ;-)
+
1. Create a folder.
2. Copy index.php, online.php, update.php, update.txt.
3. Paste it to the folder that you made.
