Support for Django documentation [and others?]

I've generalized the spider.py script to support most HTML documentation
out there and I've updated the plug-in and index file to support the
Django documentation out of the box.
commit 2982ea5c73d78d7d2b39a128c0f61a59687d1c45 1 parent 5bbea15
@xolox authored
Showing with 13,852 additions and 8,856 deletions.
  1. +22 −8 README.md
  2. +1 −1  TODO.md
  3. +58 −47 autoload.vim
  4. +13,630 −8,730 index
  5. +25 −4 pyref.vim
  6. +116 −66 spider.py
30 README.md
@@ -1,12 +1,16 @@
# Context-sensitive documentation <br> for Python source code in Vim
-The `pyref.vim` script is a plug-in for the [Vim text editor](http://www.vim.org/) that looks up keywords and identifiers in the [Python language reference](http://docs.python.org/reference/index.html) and [library reference](http://docs.python.org/library/index.html) documentation using your web browser. The `:PyRef` command looks up the identifier given as an argument while the `<F1>` mapping looks up the item at the text cursor. Both are only made available inside Python buffers. The lookup works by scanning through a special index file which is included in the ZIP archive below, but you can also create/update the index yourself using the Python script [spider.py](http://github.com/xolox/vim-pyref/blob/master/spider.py).
+The `pyref.vim` script is a plug-in for the [Vim text editor](http://www.vim.org/) that helps you look up the documentation for keywords and identifiers from the following sources using your web browser:
-## Install & usage
+ * [Python language reference](http://docs.python.org/reference/)
+ * [Python library reference](http://docs.python.org/library/)
+ * [Django documentation](http://docs.djangoproject.com/)
+
+The `:PyRef` command looks up the identifier given as an argument while the `<F1>` mapping (only available in Python buffers) looks up the item under the text cursor. The lookup works by scanning through a special index file which is included in the ZIP archive below, but you can also create/update the index yourself using the Python script [spider.py](http://github.com/xolox/vim-pyref/blob/master/spider.py).
-Unzip the most recent [ZIP archive](http://peterodding.com/code/vim/downloads/pyref) file inside your Vim profile directory (usually this is `~/.vim` on UNIX and `%USERPROFILE%\vimfiles` on Windows), restart Vim and execute the command `:helptags ~/.vim/doc` (use `:helptags ~\vimfiles\doc` instead on Windows). Now try it out: Open a Python script and press the `<F1>` key on something interesting.
+## Install & usage
-The following paragraphs explain the available options:
+Unzip the most recent [ZIP archive](http://peterodding.com/code/vim/downloads/pyref) file inside your Vim profile directory (usually this is `~/.vim` on UNIX and `%USERPROFILE%\vimfiles` on Windows), restart Vim and execute the command `:helptags ~/.vim/doc` (use `:helptags ~\vimfiles\doc` instead on Windows). Now try it out: Open a Python script and press the `<F1>` key on something interesting. If it doesn't work or you want to change how it works, see the options documented below.
### The `g:pyref_mapping` option
@@ -16,13 +20,23 @@ If you press `<F1>` and nothing happens you're probably using a terminal that do
Note that setting `g:pyref_mapping` won't change the key mapping in existing buffers.
-### The `g:pyref_mirror` option
+### The `g:pyref_python` option
-This option is useful when you don't always have a reliable internet connection available while coding. Most Linux distributions have an installable package containing the Python documentation, for example on Ubuntu and Debian you can execute the following command to install the documentation:
+This option is useful when you don't always have a reliable internet connection available while coding. Most Linux distributions have an installable package containing the Python documentation, for example on [Ubuntu](http://packages.ubuntu.com/python2.6-doc) and [Debian](http://packages.debian.org/python2.6-doc) you can execute the following command to install the documentation:
     $ sudo apt-get install python2.6-doc
-The above package puts the documentation in `/usr/share/doc/python2.6/html/` which happens to be the default location checked by the `pyref.vim` script. If you've installed the documentation elsewhere you can change the global variable `g:pyref_mirror` accordingly.
+The above package puts the documentation in `/usr/share/doc/python2.6/html/` which happens to be the default path checked by the `pyref.vim` script. If you've installed the documentation in a different location you can change the global variable `g:pyref_python`, e.g.:
+
+    :let g:pyref_python = $HOME . '/docs/python'
+
+### The `g:pyref_django` option
+
+This option works like `g:pyref_python` but allows you to configure the path to your local Django documentation. On [Ubuntu](http://packages.ubuntu.com/python-django-doc) and [Debian](http://packages.debian.org/python-django-doc) you can execute the following command to install the Django documentation:
+
+    $ sudo apt-get install python-django-doc
+
+In this case you shouldn't have to change anything because `pyref.vim` is already configured to be compatible with the `python-django-doc` package.
### The `g:pyref_index` option
@@ -34,7 +48,7 @@ You can change the above options permanently by putting the relevant `:let` stat
## Contact
-If you have questions, bug reports, suggestions, etc. the author can be contacted at <peter@peterodding.com>. The latest version is available at <http://peterodding.com/code/vim/pyref/> and <http://github.com/xolox/vim-pyref>. If you like the script please vote for it on [www.vim.org](http://www.vim.org/scripts/script.php?script_id=3104).
+If you have questions, bug reports, suggestions, etc. the author can be contacted at <peter@peterodding.com>. The latest version is available at <http://peterodding.com/code/vim/pyref/> and <http://github.com/xolox/vim-pyref>. If you like the script please vote for it on [Vim Online](http://www.vim.org/scripts/script.php?script_id=3104).
## License
2  TODO.md
@@ -1,3 +1,3 @@
# To-do list
- * Convert `pyref.vim` to an autoload plug-in?
+ * Switch to `python-doc` package instead of `python2.6-doc`?
105 autoload.vim
@@ -1,12 +1,11 @@
" Vim auto-load script
" Author: Peter Odding <peter@peterodding.com>
-" Last Change: September 18, 2010
+" Last Change: December 19, 2010
" URL: http://peterodding.com/code/vim/pyref/
let s:script = expand('<sfile>:p:~')
function! xolox#pyref#enable() " {{{1
- command! -buffer -nargs=? PyRef call xolox#pyref#lookup(<q-args>)
let command = '%s <silent> <buffer> %s %s:call xolox#pyref#at_cursor()<CR>'
let mapping = exists('g:pyref_mapping') ? g:pyref_mapping : '<F1>'
execute printf(command, 'nmap', mapping, '')
@@ -26,52 +25,60 @@ function! xolox#pyref#at_cursor() " {{{1
call xolox#pyref#lookup(ident)
endfunction
+function! xolox#pyref#complete(arglead, cmdline, cursorpos) " {{{1
+ let entries = map(s:read_index(), 'matchstr(v:val, ''^\S\+'')')
+ let pattern = xolox#escape#pattern(a:arglead)
+ call filter(entries, 'v:val =~ pattern')
+ if len(entries) > &lines
+ let entries = entries[0 : &lines - 1]
+ call add(entries, '...')
+ endif
+ return entries
+endfunction
+
function! xolox#pyref#lookup(identifier) " {{{1
- let mirror = s:find_mirror()
let ident = xolox#trim(a:identifier)
" Do something useful when there's nothing at the current position.
if ident == ''
- call xolox#open#url(mirror . '/contents.html')
+ call s:show_match('http://docs.python.org/contents.html')
return
endif
" Escape any dots in the expression so it can be used as a pattern.
let pattern = substitute(ident, '\.', '\\.', 'g')
+ let lines = s:read_index()
" Search for an exact match of a module name or identifier in the index.
- let indexfile = s:find_index()
- try
- let lines = readfile(indexfile)
- catch
- let lines = []
- call xolox#warning("%s: Failed to read index file! (%s)", s:script, indexfile)
- endtry
- if s:try_lookup(lines, mirror, '^\C\(module-\|exceptions\.\)\?' . pattern . '\t')
+ if s:try_lookup(lines, '^\C\(module-\|exceptions\.\)\?' . pattern . '\t')
return
endif
" Search for a substring match on word boundaries.
- if s:try_lookup(lines, mirror, '\C\<' . pattern . '\>.*\t')
+ if s:try_lookup(lines, '\C\<' . pattern . '\>.*\t')
return
endif
" Try to match a method name of one of the standard Python types: strings,
" lists, dictionaries and files (not exactly ideal but better than nothing).
- for [url, method_pattern] in s:object_methods
+ for [url, method_pattern] in [
+ \ ['library/stdtypes.html#str.%s', '\C\.\@<=\(capitalize\|center\|count\|decode\|encode\|endswith\|expandtabs\|find\|format\|index\|isalnum\|isalpha\|isdigit\|islower\|isspace\|istitle\|isupper\|join\|ljust\|lower\|lstrip\|partition\|replace\|rfind\|rindex\|rjust\|rpartition\|rsplit\|rstrip\|split\|splitlines\|startswith\|strip\|swapcase\|title\|translate\|upper\|zfill\)$'],
+ \ ['tutorial/datastructures.html#more-on-lists', '\C\.\@<=\(append\|count\|extend\|index\|insert\|pop\|remove\|reverse\|sort\)$'],
+ \ ['library/stdtypes.html#dict.%s', '\C\.\@<=\(clear\|copy\|fromkeys\|get\|has_key\|items\|iteritems\|iterkeys\|itervalues\|keys\|pop\|popitem\|setdefault\|update\|values\)$'],
+ \ ['library/stdtypes.html#file.%s', '\C\.\@<=\(close\|closed\|encoding\|errors\|fileno\|flush\|isatty\|mode\|name\|newlines\|next\|read\|readinto\|readline\|readlines\|seek\|softspace\|tell\|truncate\|write\|writelines\|xreadlines\)$']]
let method = matchstr(ident, method_pattern)
if method != ''
if url =~ '%s'
let url = printf(url, method)
endif
- call xolox#open#url(mirror . '/' . url)
+ call s:show_match('http://docs.python.org/' . url)
return
endif
endfor
" Search for a substring match in the index.
- if s:try_lookup(lines, mirror, '\C' . pattern . '.*\t')
+ if s:try_lookup(lines, '\C' . pattern . '.*\t')
return
endif
@@ -83,57 +90,51 @@ function! xolox#pyref#lookup(identifier) " {{{1
while len(parts) > 1
call remove(parts, 0)
let pattern = '\C\<' . join(parts, '\.') . '$'
- if s:try_lookup(lines, mirror, pattern)
+ if s:try_lookup(lines, pattern)
return
endif
endwhile
- " As a last resort, search all of http://docs.python.org/ using Google.
- call xolox#open#url('http://google.com/search?btnI&q=inurl:docs.python.org/+' . ident)
+ " As a last resort, try Google's "I'm Feeling Lucky" search.
+ call xolox#open#url('http://google.com/search?btnI&q=python+' . ident)
endfunction
-" This list of lists contains [url_format, method_pattern] pairs that are used
-" to recognize calls to methods of objects that are one of Python's standard
-" types: strings, lists, dictionaries and file handles.
-let s:object_methods = [
- \ ['library/stdtypes.html#str.%s', '\C\.\@<=\(capitalize\|center\|count\|decode\|encode\|endswith\|expandtabs\|find\|format\|index\|isalnum\|isalpha\|isdigit\|islower\|isspace\|istitle\|isupper\|join\|ljust\|lower\|lstrip\|partition\|replace\|rfind\|rindex\|rjust\|rpartition\|rsplit\|rstrip\|split\|splitlines\|startswith\|strip\|swapcase\|title\|translate\|upper\|zfill\)$'],
- \ ['tutorial/datastructures.html#more-on-lists', '\C\.\@<=\(append\|count\|extend\|index\|insert\|pop\|remove\|reverse\|sort\)$'],
- \ ['library/stdtypes.html#dict.%s', '\C\.\@<=\(clear\|copy\|fromkeys\|get\|has_key\|items\|iteritems\|iterkeys\|itervalues\|keys\|pop\|popitem\|setdefault\|update\|values\)$'],
- \ ['library/stdtypes.html#file.%s', '\C\.\@<=\(close\|closed\|encoding\|errors\|fileno\|flush\|isatty\|mode\|name\|newlines\|next\|read\|readinto\|readline\|readlines\|seek\|softspace\|tell\|truncate\|write\|writelines\|xreadlines\)$']]
-
-function! s:try_lookup(lines, mirror, pattern) " {{{1
+function! s:try_lookup(lines, pattern) " {{{1
call xolox#debug("%s: Trying to match pattern %s", s:script, a:pattern)
let index = match(a:lines, a:pattern)
if index >= 0
let url = split(a:lines[index], '\t')[1]
- call xolox#open#url(a:mirror . '/' . url)
+ call s:show_match(url)
return 1
endif
endfunction
-function! s:find_mirror() " {{{1
- if exists('g:pyref_mirror')
- return g:pyref_mirror
- else
- let local_mirror = '/usr/share/doc/python2.6/html'
- if isdirectory(local_mirror)
- return 'file://' . local_mirror
- else
- return 'http://docs.python.org'
- endif
+function! s:show_match(url) " {{{1
+ let python_docs = s:get_option('pyref_python')
+ let django_docs = s:get_option('pyref_django')
+ let url = a:url
+ if url =~ '^http://docs\.python\.org/' && isdirectory(python_docs)
+ let url = substitute(url, '^http://docs\.python\.org', 'file://' . python_docs, '')
+ elseif url =~ '^http://docs\.djangoproject\.com/en/1\.1/' && isdirectory(django_docs)
+ let url = substitute(url, '/#', '.html#', '')
+ let url = substitute(url, '^http://docs\.djangoproject\.com/en/1\.1', 'file://' . django_docs, '')
endif
+ call xolox#open#url(url)
endfunction
-function! s:find_index() " {{{1
- if exists('g:pyref_index')
- let index = g:pyref_index
- elseif xolox#is_windows()
- let index = '~/vimfiles/misc/pyref_index'
+function! s:get_option(name) " {{{1
+ if exists('b:' . a:name)
+ return eval('b:' . a:name)
+ elseif exists('g:' . a:name)
+ return eval('g:' . a:name)
else
- let index = '~/.vim/misc/pyref_index'
+ return ""
endif
- let abspath = fnamemodify(index, ':p')
+endfunction
+
+function! s:find_index() " {{{1
+ let abspath = fnamemodify(g:pyref_index, ':p')
if !filereadable(abspath)
let msg = "%s: The index file doesn't exist or isn't readable! (%s)"
call xolox#warning(msg, s:script, index)
@@ -142,4 +143,14 @@ function! s:find_index() " {{{1
return abspath
endfunction
+function! s:read_index() " {{{1
+ let indexfile = s:find_index()
+ try
+ return readfile(indexfile)
+ catch
+ call xolox#warning("%s: Failed to read index file! (%s)", s:script, indexfile)
+ return []
+ endtry
+endfunction
+
" vim: ts=2 sw=2 et nowrap
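
The URL rewriting that `s:show_match()` performs in the diff above can be sketched in Python (version 3) for readers less familiar with Vim script. This is an illustration under stated assumptions, not part of the commit: the function name `localize` and the injectable `isdir` parameter are hypothetical, while the two URL prefixes and the `/#` to `.html#` substitution are taken from the Vim code.

```python
import os

# Prefixes copied from the patterns in s:show_match().
PYTHON_PREFIX = 'http://docs.python.org'
DJANGO_PREFIX = 'http://docs.djangoproject.com/en/1.1'

def localize(url, python_docs, django_docs, isdir=os.path.isdir):
    """Return a file:// URL when a local mirror exists, else the URL unchanged.

    `isdir` is injectable (hypothetical convenience) so the logic can be
    exercised without the documentation packages installed.
    """
    if url.startswith(PYTHON_PREFIX + '/') and isdir(python_docs):
        return 'file://' + python_docs + url[len(PYTHON_PREFIX):]
    if url.startswith(DJANGO_PREFIX + '/') and isdir(django_docs):
        # Online Django URLs look like ".../page/#anchor"; the rewrite in the
        # Vim code turns the first "/#" into ".html#" for the local files.
        url = url.replace('/#', '.html#', 1)
        return 'file://' + django_docs + url[len(DJANGO_PREFIX):]
    return url
```

Because the index now stores absolute online URLs, a lookup still works when no local mirror is installed: the URL simply passes through unchanged.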
22,360 index
13,630 additions, 8,730 deletions not shown
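
The index diff itself is not shown, but the comments in `spider.py` describe its format: each line holds an identifier, a tab, and the associated URL. The following Python 3 sketch shows how such lines could be searched with the same escalating patterns that `xolox#pyref#lookup()` uses (exact match, word-boundary match, plain substring). The sample entries and the name `lookup` are assumptions made for illustration, `\b` only approximates Vim's `\<` and `\>`, and the method-name stage and the Google fallback are left out.

```python
import re

# Hypothetical sample entries in the "identifier<TAB>URL" format.
SAMPLE_INDEX = (
    "module-os\thttp://docs.python.org/library/os.html#module-os\n"
    "os.path.join\thttp://docs.python.org/library/os.path.html#os.path.join\n"
)

def lookup(lines, ident):
    """Return the URL of the first index line matching ident, or None."""
    escaped = re.escape(ident)
    patterns = [
        r'^(module-|exceptions\.)?' + escaped + r'\t',  # exact identifier
        r'\b' + escaped + r'\b.*\t',                    # on word boundaries
        escaped + r'.*\t',                              # any substring
    ]
    # Each pattern is tried against every line before moving to the next,
    # looser pattern, mirroring the successive s:try_lookup() calls.
    for pattern in patterns:
        for line in lines:
            if re.search(pattern, line):
                return line.split('\t')[1]
    return None

lines = SAMPLE_INDEX.splitlines()
```

Calling `lookup(lines, 'os')` hits the exact-match stage through the `module-` prefix, while `lookup(lines, 'join')` only succeeds at the word-boundary stage.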
29 pyref.vim
@@ -1,9 +1,9 @@
" Vim plug-in
" Author: Peter Odding <peter@peterodding.com>
-" Last Change: September 18, 2010
+" Last Change: December 19, 2010
" URL: http://peterodding.com/code/vim/pyref/
" License: MIT
-" Version: 0.6
+" Version: 0.7
" Support for automatic update using the GLVS plug-in.
" GetLatestVimScripts: 3104 1 :AutoInstall: pyref.zip
@@ -15,10 +15,31 @@ else
let g:loaded_pyref = 1
endif
-" Automatic command to enable plug-in for Python buffers only.
+" Default location of index file, should be fine in most cases.
+if !exists('g:pyref_index')
+ if xolox#is_windows()
+ let g:pyref_index = '~/vimfiles/misc/pyref_index'
+ else
+ let g:pyref_index = '~/.vim/misc/pyref_index'
+ endif
+endif
+
+" Local Python documentation as installed by e.g. sudo apt-get install python2.6-doc
+if !exists('g:pyref_python')
+ let g:pyref_python = '/usr/share/doc/python2.6/html'
+endif
+" Local Django documentation as installed by e.g. sudo apt-get install python-django-doc
+if !exists('g:pyref_django')
+ let g:pyref_django = '/usr/share/doc/python-django-doc/html'
+endif
+
+" Automatic command to enable key mapping in Python buffers.
augroup PluginPyRef
autocmd! FileType python call xolox#pyref#enable()
augroup END
-" vim: ts=2 sw=2 et nowrap
+" User command that looks up given argument and supports completion.
+command! -nargs=? -complete=customlist,xolox#pyref#complete PyRef call xolox#pyref#lookup(<q-args>)
+
+" vim: ts=2 sw=2 et
182 spider.py 100755 → 100644
@@ -3,87 +3,137 @@
# Copyright 2010 Peter Odding <peter@peterodding.com>
# This program is licensed under the MIT license.
-# This program indexes the keywords and identifiers in the Python language and
-# library reference HTML documentation and creates an index file with keywords
-# and their associated URL. Each line starts with a keyword or an identifier
-# followed by a tab and ends with the associated URL. The index file is used
-# by the pyref.vim plug-in for Vim to provide context sensitive documentation.
+# This Python script indexes a local/remote tree of Python HTML documentation
+# and creates/updates an index file that maps identifiers to their associated
+# URL in the documentation. Each line starts with a keyword or an identifier
+# followed by a tab and ends with the associated URL. The index file is used by
+# the pyref.vim plug-in for Vim to provide context sensitive documentation.
+# For more information visit http://peterodding.com/code/vim/pyref/
-# If you have the HTML documentation available on your hard drive (e.g. by
-# installing the Ubuntu package `python2.6-doc') then I recommend that you
-# index those files by setting the "local_dir" variable below, otherwise
-# http://docs.python.org/library/ will be indexed which can take a while.
+import os
+import re
+import sys
+import time
+import urllib
-local_dir = '/usr/share/doc/python2.6/html/'
-docs_mirror = 'http://docs.python.org/'
-index_file = '~/.vim/misc/pyref_index'
+DEBUG = False
-# You shouldn't need to change anything below here.
+indexfile = os.path.expanduser('~/.vim/misc/pyref_index')
+scriptname = os.path.split(sys.argv[0])[1]
-import os, re, time, urllib
+def message(text, *args):
+  text = '%s: ' + text + '\n'
+  text %= (scriptname,) + args
+  sys.stderr.write(text)
-# If local documentation is available then use that,
-# otherwise default to the latest online documentation.
-selected_docs = os.path.isdir(local_dir) and ('file://' + local_dir) or docs_mirror
+def verbose(text, *args):
+  if DEBUG:
+    message(text, *args)
-def getpage(url):
+def error(text, *args):
+  message(text, *args)
+  sys.exit(1)
+
+# Make sure the Beautiful Soup HTML parser is available.
+try:
+  from BeautifulSoup import BeautifulSoup
+except ImportError:
+  error("""You'll need to install the Beautiful Soup HTML parser. If you're running
+Debian/Ubuntu try the following: sudo apt-get install python-beautifulsoup""")
+
+# Make sure the user provided a location to spider.
+if len(sys.argv) < 2:
+  error("Please provide the URL to spider as a command line argument.")
+
+# Validate/munge the location so it points to an index.html page.
+root = sys.argv[1].replace('file://', '')
+if not root.startswith('http://'):
+  root = os.path.realpath(root)
+  if os.path.isdir(root):
+    page = os.path.join(root, 'index.html')
+    if os.path.isfile(page):
+      root = page
+    else:
+      error("Failed to determine index page in %r!", root)
+  elif not os.path.isfile(root):
+    error("The location %r doesn't seem to exist!", root)
+  root = 'file://' + root
+first_page = root
+root = os.path.split(root)[0]
+
+# If the index file already exists, read it so we can merge the results.
+anchors = {}
+if os.path.isfile(indexfile):
+  message("Reading existing entries from %s", indexfile)
+  handle = open(indexfile)
+  nfiltered = 0
+  for line in handle:
+    anchor, target = line.strip().split('\t')
+    if target.startswith(root):
+      nfiltered += 1
+    else:
+      anchors[anchor] = target
+  handle.close()
+  message("Read %i and filtered %i entries", len(anchors), nfiltered)
+
+# Start from the given location and collect anchors from all related pages.
+queued_pages = [first_page]
+visited_pages = {}
+while queued_pages:
+  location = queued_pages.pop()
+  # Fetch the selected page.
   try:
-    handle = urllib.urlopen(url)
-    contents = handle.read().decode('utf-8')
+    verbose("Fetching %r", location)
+    handle = urllib.urlopen(location)
+    contents = handle.read()
     handle.close()
-    if not url.startswith('file://'):
-      # Rate-limit the number of connections to http://docs.python.org/
+    if not location.startswith('file://'):
+      # Rate limit fetching of remote pages.
       time.sleep(1)
-    return contents
   except:
-    print "\rFailed to get %s!" % url
-    return ''
-
-pages = []
-
-# Prepare to index the language and library references.
-for directory in 'reference', 'library':
-  directory = os.path.join(selected_docs, directory)
-  url = os.path.join(directory, 'index.html')
-  pattern = '<li class="toctree-l[12]"><a class="reference external" href="([^"]+)'
-  print "\rScanning %s" % url,
-  for target in re.findall(pattern, getpage(url)):
-    # Strip fragment identifiers.
+    verbose("Failed to fetch %r!", location)
+    continue
+  # Mark the current page as visited so we don't fetch it again.
+  visited_pages[location] = True
+  # Parse the page's HTML to extract links and anchors.
+  verbose("Parsing %r", location)
+  tagsoup = BeautifulSoup(contents)
+  npages = 0
+  for tag in tagsoup.findAll('a', href=True):
+    target = tag['href']
+    # Strip anchors and ignore anchor-only links.
     target = re.sub('#.*$', '', target)
-    # Convert relative to absolute URLs.
-    if target.startswith('/') or not target.startswith('http://'):
-      target = os.path.join(directory, target)
-    if target not in pages:
-      pages.append(target)
+    if target:
+      # Convert the link target to an absolute, canonical URL?
+      if not re.match(r'^\w+://', target):
+        target = os.path.join(os.path.split(location)[0], target)
+      scheme, target = target.split('://')
+      target = scheme + '://' + os.path.normpath(target)
+      # Ignore links pointing outside the root URL and don't process any page more than once.
+      if target.startswith(root) and target not in visited_pages and target not in queued_pages:
+        queued_pages.append(target)
+        npages += 1
+  nidents = 0
+  for tag in tagsoup.findAll(True, id=True):
+    anchor = tag['id']
+    if anchor not in anchors:
+      anchors[anchor] = '%s#%s' % (location, anchor)
+      nidents += 1
+    else:
+      verbose("Ignoring identifier %r duplicate target %r!", anchor, location)
+  message("Extracted %i related pages, %i anchors from %r..", npages, nidents, location)
-# Create a dictionary with all anchors in the documentation.
-anchors = {}
-duplicates = 0
-for page in sorted(pages):
-  print "\rIndexing %s" % page,
-  for anchor in re.findall('\sid="([^"]+)">', getpage(page)):
-    url = page + '#' + anchor
-    if anchor in anchors:
-      # sys.stderr.write("\rConflicting anchors! (%s and %s)\n" % (anchors[anchor], url))
-      duplicates += 1
-    anchors[anchor] = url
-
-print "\nIndexed %i pages, %i anchors (%i duplicates)" % (len(pages), len(anchors), duplicates)
-
-# Finally write a tab-delimited list of (keyword, URL) pairs to the index file.
-index_file = os.path.expanduser(index_file)
-print "Writing index file %s.. " % index_file,
-handle = open(index_file, 'w')
+message("Scanned %i pages, extracted %i anchors", len(visited_pages), len(anchors))
+
+# Write the tab delimited list of (keyword, URL) pairs to the index file.
+message("Writing index file %r", indexfile)
+handle = open(indexfile, 'w')
bytes_written = 0
-for keyword in sorted(anchors.keys()):
-  url = anchors[keyword]
-  # Convert absolute to relative URLs.
-  if url.startswith(selected_docs):
-    url = url[len(selected_docs):]
-  line = '%s\t%s\n' % (keyword, url)
+for anchor in sorted(anchors.keys()):
+  line = '%s\t%s\n' % (anchor, anchors[anchor])
   handle.write(line)
   bytes_written += len(line)
handle.close()
-print "OK, wrote", bytes_written / 1024, "KB"
+message("Done, wrote %i KB to %r", bytes_written / 1024, indexfile)
# vim: ts=2 sw=2 et
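
The spider above resolves relative links with `os.path.join` and `os.path.normpath`, which treats URLs as file system paths. As an alternative sketch, and explicitly not what this commit does, the same three steps (strip the fragment, make the link absolute, discard links outside the root) could be written with the URL helpers in Python 3's `urllib.parse`. The function name `resolve_link` is hypothetical.

```python
from urllib.parse import urldefrag, urljoin

def resolve_link(location, href, root):
    """Return the absolute URL that href points to, or None to skip it.

    location -- URL of the page the link was found on
    href     -- raw value of the <a href="..."> attribute
    root     -- URL prefix that every indexed page must share
    """
    # Strip the fragment and ignore anchor-only links such as "#top".
    href, _fragment = urldefrag(href)
    if not href:
        return None
    # urljoin resolves relative paths (including "..") against the page URL.
    target = urljoin(location, href)
    # Ignore links pointing outside the documentation tree being indexed.
    return target if target.startswith(root) else None
```

The visited/queued bookkeeping from the original loop would stay with the caller; this helper only decides what a single link resolves to.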