3/11/2011

GDB7 -> Python

python gdb command

gdb module imported

Clouds

`OpenStack’

Start Here

ansolabs.com/deploy

gh/ansolabs

openstack cookbook + chef

Interesting “Req”

`Vagrant’

API for vbox

gist/786945/solo.rb
euca-add-keypair
euca-run-instances
euca-describe-instances

`libvirtd’ && `libvirtclient’

VM Panel

Convore

https://convore.com/pycon-2011/python-vm-panel/

Speakers

CPython (Dr. Brett Cannon)

Unladen Swallow
Last part of development time was spent updating LLVM
Integration to CPython: “It’s resting”
Exists to ref-imp
Language Moratorium (2yrs)
Expired “last month” (3/11/11)
3.1~ish stop changing the language
Anything that would make “Books change”
Coming in 3.3 thanks to expiration
PEP-380 – yield from
Sandboxing
Sandboxing was a good “dead end for my thesis”
Language freedom competes with sandboxing on goals
Language level sandboxing is hard, VM level is correct

PyPy (Maciej Fijalkowski)

Ctypes
Has branch to make it faster
cpyx(?)::ironclad
“Emulates refcounting”
Quotes

“Fast, and has no differences. Use it if you like CPython and you want it to run fast.”

Perf

http://speed.pypy.org

Sandboxing built in

Jython (Frank Wierzbicki)

Doesn’t have Ctypes
Wants it from IronPython, more or less
“PBVCM” or something like it
Bytecode interpreter
JVM GC
Plan
“Loves to skip releases”
Make 2.6, go to 3 and see if they need 2.7
Trivia
Using unicode internally all the time

Iron Python (Dino Viehland)

Intends to beat pypy to 2.7
Aiming for this week?
“MSPL” => Apache
OSI approved?! wtf.
Switched to Apache because people “were nervous”
Sandboxing
Does not want sandboxing as a language feature
Rather catch perf, 3
Sandboxing is from .NET
Ctypes
Since 2.6
“Encourage everyone to use Ctypes”
Most portable way to expose code
Resolver Systems – IronClad
Support for running C extensions
Needs help, C#

Backup is hard – BitBacker

Convore

https://convore.com/pycon-2011/backup-is-hard-lets-go-shopping-gary-bernhardt/

Bigness

There are too many files

Generators

API

restlib.py – Opensource?

URL Story

There was a 4M char URL..

HTTP Client

asyncore

S3 (2007-2008)

~100ms write latency (At the time)
Writes 500, 503, fail, bounce in ways

You can use httplib as just a parser!

Replace def connect()
Use fake socket that uses StringIO
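A minimal sketch of the parser-only trick, assuming the Python 3 equivalents of the modules named in the notes (http.client for httplib, io.BytesIO for StringIO). FakeSocket and parse_http_response are hypothetical names, not anything from the talk:

```python
import io
from http.client import HTTPResponse


class FakeSocket:
    """Just enough of a socket for HTTPResponse: it only needs makefile()."""

    def __init__(self, raw_bytes):
        self._buffer = io.BytesIO(raw_bytes)

    def makefile(self, *args, **kwargs):
        return self._buffer


def parse_http_response(raw_bytes):
    """Parse raw response bytes without touching the network."""
    response = HTTPResponse(FakeSocket(raw_bytes))
    response.begin()  # reads the status line and headers
    return response


raw = (
    b"HTTP/1.1 503 Service Unavailable\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"slow down"
)
response = parse_http_response(raw)
status = response.status
content_type = response.getheader("Content-Type")
body = response.read()
```

The bytes could come from any transport (an asyncore buffer, a file, a test fixture), which is presumably the point of separating parsing from the connection.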

asynchttp – Opensource

The System

Crawl->Diff->spool–{n}–>Chunks->Store

Crawl’s a generator

Manifest that is diffed is cached
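The pipeline above could be sketched as chained generators. This is only an illustration: every function name is hypothetical, a dict stands in for the filesystem and for the storage backend, and the spool stage is left out:

```python
import hashlib


def crawl(tree):
    """Yield (path, data) pairs lazily; `tree` stands in for a filesystem walk."""
    for path in sorted(tree):
        yield path, tree[path]


def diff(entries, manifest):
    """Yield only entries whose content hash differs from the cached manifest."""
    for path, data in entries:
        digest = hashlib.sha256(data).hexdigest()
        if manifest.get(path) != digest:
            manifest[path] = digest
            yield path, data


def chunk(entries, size):
    """Split each changed file into fixed-size chunks."""
    for path, data in entries:
        for offset in range(0, len(data), size):
            yield path, offset, data[offset:offset + size]


def store(chunks, backend):
    """Drain the pipeline into a dict standing in for remote storage."""
    for path, offset, payload in chunks:
        backend[(path, offset)] = payload
    return backend


files = {"a.txt": b"hello world", "b.txt": b"unchanged"}
manifest = {"b.txt": hashlib.sha256(b"unchanged").hexdigest()}
stored = store(chunk(diff(crawl(files), manifest), size=4), {})
```

Because each stage is a generator, nothing is read until store() pulls on the chain, so the full file list never has to sit in memory.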

OpenSource

Dingus – Mocking

blog.extracheese.org

Will post soon
restlib, asynchttp with new names

Author

Working on “Destroy All Software”

Something to do with screencasts

Context Managers and Decorators

DecoratorManagers

from contextlib import contextmanager

functools.lru_cache

functools.total_ordering

from tempfile import NamedTemporaryFile

with NamedTemporaryFile('w+') as tmp: # etc
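A short sketch pulling together the tools listed in this section. The names logged, job, fib and Version are made up for illustration. It assumes Python 3.2 or later, where an object built with contextmanager also works as a decorator:

```python
from contextlib import contextmanager
from functools import lru_cache, total_ordering
from tempfile import NamedTemporaryFile

events = []


@contextmanager
def logged(label):
    """Record entry and exit around a block or a decorated function."""
    events.append("enter " + label)
    try:
        yield
    finally:
        events.append("exit " + label)


@logged("job")  # used as a decorator
def job():
    events.append("running")


@lru_cache(maxsize=None)
def fib(n):
    """Memoized Fibonacci."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


@total_ordering
class Version:
    """Only __eq__ and __lt__ are written; total_ordering derives the rest."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key


job()
with logged("block"):  # the same helper used as a context manager
    events.append("inside")

with NamedTemporaryFile("w+") as tmp:
    tmp.write("scratch data")
    tmp.seek(0)
    contents = tmp.read()
```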

Ian Bicking’s brain – PIP/Virtualenv

site-packages

On boot “site.py” is loaded to find site-packages

site.py is in the stdlib
sys.prefix and sys.exec_prefix

–I MISSED THINGS– :(

Explanation of hand-rolling venv
Fails (But works in that regard) because of missing site-packages

virtualenv

Code can detect presence of venv

Check for sys.real_prefix
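A minimal detection sketch. The sys.real_prefix check is the one from the talk; the sys.base_prefix fallback is an addition that assumes the later stdlib venv module (Python 3.3+), and in_virtual_environment is a hypothetical helper name:

```python
import sys


def in_virtual_environment():
    # virtualenv (as described in the talk) sets sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Assumption beyond the notes: the stdlib venv module (Python 3.3+)
    # instead leaves sys.base_prefix pointing at the original install.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


inside_venv = in_virtual_environment()
```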

Bootstrap (site.py) requires stdlib

os, etc
Linked to bootstrap

venv forks site.py

Alts

–missed one–
rvirtualenv
PYTHONHOME
pythonv
“Might be like virtualenv in python itself”

PIP

“Ugliest. Hack. Ever. (Sorry, Ian)”

Does: import, pick setup file, execfile

:(
But hey, it works.

Always uses setuptools

–record

Record everything that’s getting installed.

Using PIP as an API

RequirementSet object
InstallRequirement object
reqset.add_requirement(IR)
finder = PackageFinder(…)
reqset.
prepare_files
install
cleanup_SOMETHING

Extreme Network Programming

sigslot

observer

Multicast on an interface with no route to the sender drops the packet

Multicast reliability

RFC

5740 – NORM
5053

SOL_PACKET not defined in Python’s socket module

tuntap

tun -> IP only

tap -> has ethernet

dpkt

str(pkt) -> packet buffer

Fragmented packets are decoded wrong

Bridging

Problems

ETH_P_ALL will see its own packets
loops

Solution

packet type of sockaddr_ll

Fault/Disruption tolerant

Open Source Impl

dtn.sf.net

RFC 4838

Can networks work with uptime < 5% (e.g. space)

Would web services work between earth and saturn with 100% uptime?
No.
Interactive, syn/ack fail
Light-speed limit

Bundle Protocol

Store-and-forward

Endpoints are URIs

No IP
No DNS
dtn://mode.hierarchy.dtn/path/to/service
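A toy sketch of the two ideas above: endpoints addressed as plain URIs, and store-and-forward delivery. The Node class is hypothetical and is not the Bundle Protocol itself; it only shows data being held until a link appears:

```python
from collections import deque
from urllib.parse import urlsplit


class Node:
    """Holds bundles while the outgoing link is down, forwards when it is up."""

    def __init__(self):
        self.stored = deque()
        self.delivered = []

    def receive(self, endpoint, payload):
        self.stored.append((endpoint, payload))

    def contact(self):
        """Called when a link becomes available; flush everything stored."""
        while self.stored:
            self.delivered.append(self.stored.popleft())


# The endpoint is just a URI: no IP address and no DNS lookup involved.
endpoint = urlsplit("dtn://mode.hierarchy.dtn/path/to/service")
node = Node()
node.receive(endpoint.geturl(), b"telemetry")
pending_before_contact = len(node.stored)
node.contact()
```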

Personal note

This smells a little bit like 0mq’s ideas of abstractions for sockets

Rev. Healy

Awesome.

3/12/2011

Python high performance computing

Portability important

Hardware changes fast, frequently

Languages used

C/C++, Fortran, Python

Sometimes, someone tries to run lisp

Quote

“People are doing high performance computing with Python… how do we stop them?” -Senior Performance Engineer

Easy to Optimize

Easy to embed critical sections

Only spend time on speed if really needed

2 BlueGene /P Systems

Surveyor

1024 nodes
4096 cores

Intrepid

40960 nodes
163840 cores

Justification

Common computers will fail at a rate of something like 1/45seconds

Arch

Chip
PPC, like the wii but with more FPU
Node card
32 chips, 4x4x2
Rack
32 node cards

Viz systems

OS: SLES 10 SP 2

80TB -> Disk

In Minutes.

7PB filled in a few days

Parallelism

GPU

PyOpenCL
Slower than CUDA, more abstraction
PyCUDA

MPI

Preferred at LCF
MPI standardized for years.
mpi4py
Instead of passing around C datatypes pass around python objects
Uses pickling, overhead
Contiguous C buffers go at wire speed
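A stdlib-only illustration of the pickling overhead mentioned above. It does not use mpi4py or MPI at all; it only compares the bytes a pickled list of floats takes against the same values in a contiguous buffer of C doubles:

```python
import pickle
from array import array

samples = [i * 0.5 for i in range(1000)]

# Generic Python objects have to be serialized before they can be sent.
pickled = pickle.dumps(samples)

# A contiguous buffer of C doubles can be handed over as raw memory.
buffer = array("d", samples)
raw = buffer.tobytes()

pickled_size = len(pickled)
raw_size = len(raw)
round_trip = pickle.loads(pickled)
```

In mpi4py the same split shows up as lowercase methods such as comm.send (pickle-based) versus uppercase ones such as comm.Send (buffer-based). That detail comes from mpi4py's own documentation, not from these notes.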

Gpaw (Project)

DFT (Would take three weeks to explain what the hell that is)

Nature, Chemistry, etc

Real, published science is being done.

Downsides

Difficult on weird OSes

Polling

Thread management

Dynamic loading

Not pleasant at scale

Parallel filesystems

don’t like metadata traffic
don’t like small files

1h 45m boot for hello world on large cluster.

Python hacked to read files at one place, and MPI broadcasts them to the cluster

Live demo of pi computation on Intrepid

40960 cores in use + MPI

http://status.alcf.anl.gov
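The notes do not record how the demo computed pi. One common approach, assumed here, is midpoint-rule integration of 4/(1+x^2) over [0, 1], with the slices dealt out round-robin to ranks and the partial sums reduced at the end. This sketch simulates the ranks in a single process rather than using MPI:

```python
import math


def partial_pi(rank, size, intervals):
    """Midpoint-rule sum of 4 / (1 + x^2) over the slices owned by `rank`."""
    width = 1.0 / intervals
    total = 0.0
    for i in range(rank, intervals, size):
        x = (i + 0.5) * width
        total += 4.0 / (1.0 + x * x)
    return total * width


def estimate_pi(size, intervals):
    """Sum the per-rank partial results, as a reduce step would."""
    return sum(partial_pi(rank, size, intervals) for rank in range(size))


estimate = estimate_pi(size=8, intervals=100_000)
error = abs(estimate - math.pi)
```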

Writing Great Documentation

Documentation Culture

Concepts apply wider than OSS

Why people read docs?

Everyone

New users need docs

Education

New and existing users need to learn

Support

Experienced users
e.g.

Leave a note for things you may need to look up, even yourself

Troubleshooting

Annoyed users

Internals

Fellow developers
Contribution guides!

How do I contribute to a project?

“How do I contribute to company X?” Good for internal, private codes, too.

Reference

“Does anyone here know strf formatting?”

Documentation is communication

Silent users
No IRC, No lists, no Reddit
They will only read your docs
Have the same things in the docs that you would tell people on IRC

Great documentation has to serve multiple conflicting masters

“Pitch your communication to the people talking to you”
“You have to document things multiple times”

Who should write docs?

Developers

Great documentation is written by great developers

Channel excitement into prose

Anyone.

But…
“A great way for new contributors to get started with our project is to contribute documentation”
New users don’t know how your code works yet!
1) Wiki -> 2) ? -> 3) GREAT DOCS!
This does not work.
Chef does it. It’s

Perfection is not possible.

Just because you have little or crap now, doesn’t mean it won’t improve
Don’t let it stop contribution.

“Documentation driven development” (Another talk here)

Write docs first.
See TDD

What should be documented?

Tutorials

“Don’t use tutorials as a loss leader to sell books”

“Show off how the project feels”
quick
Something you can win at in 30 minutes
30 minutes is important.
Developer will trust you in 30 minutes
Getting someone to a positive experience is important
Easy
Help users feel epic win
Not too easy
Don’t sugar-coat the truth
Should weed out people that shouldn’t use the project
Your intended audience should be able to complete it

Topic guides

!Tutorials
Don’t hold hands, not copy pasta
Conceptual
Foster understanding not parroting
Comprehensive
Explain in detail

This is not the time to gloss over things

Tell me the “Why” of the topic

Reference

Complete

“Docs or it didn’t happen”

Telling people to read the source is not an option
Auto-generated reference is almost worth it
No substitute for documentation written by people
Don’t try to make software write in English

Troubleshooting

Answers to question asked in anger
FAQs are good when the Qs are actually FA’d
“the one I forgot about” “the one everyone forgets about”

Documentation is fractal

Each part of the docs has subsections that are recursive in this process

Get the slides, has good table

Which tools should we use?

Tools [almost] don’t matter

For the most part

“Tools don’t matter, but there are good ones, so use them”

Suggestions
Sphinx
ReadtheDocs
Literate programming
Good for generating guides

Tool suggestion for end-user docs (Screenshots?)

Sphinx
Screencasts
Good for UI
Personal: I still fucking hate them.

Design Justification

Helps internalize the design to users

Is it right to not document things that you intend to change?

No.

Every public API has documentation != Every private API has no documentation

Balance between “Documented means public” and “Stop doing that, you’ll put an eye out”

Api design antipatterns

http://www.aleax.it/pycin11_adap.pdf

Worst

Accept SQL directly from users

“Exposes you to the Bobby Tables problem”

Not having an API

Not having a design

API that “Just happened”

Too many APIs is also not good

Zero is wrong, Three is too many, four is right out

One, maybe two is the “sweet spot”

No one ever fully transitions to “the new API”

Root cause:

Transition from “not-really-designed” API

Cause 2.0:

Lack/Fear of commitment

No benefit to the user

Can do A in one, B in another, but not A and B together in either

No silver bullets for API visioning

Too “extreme”

Lack of balance of concerns

Don’t force your implementation language on your API

Don’t expose language-specific data
NO FUCKING PICKLES
God, I fucking hate pickles.
REST
In other cases, still REST. >:[

Undocumented APIs are as good as nonexistent

Docs need examples

Examples should be tested

Keep docs updated

Wrong docs are worse than no docs
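One way to keep documented examples tested is the stdlib doctest module, which runs the examples embedded in a docstring. This is a sketch, not something the talk prescribed, and slugify is a hypothetical function used only to have something to document:

```python
import doctest
import re


def slugify(title):
    """Turn a title into a URL slug.

    >>> slugify("Hello, World!")
    'hello-world'
    >>> slugify("  API   design  ")
    'api-design'
    """
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Run the examples embedded in the docstring as tests.
parser = doctest.DocTestParser()
runner = doctest.DocTestRunner(verbose=False)
test = parser.get_doctest(slugify.__doc__, {"slugify": slugify}, "slugify", None, 0)
results = runner.run(test)
```

If the implementation drifts away from the documented output, results.failed becomes non-zero, so the wrong docs get caught instead of shipped.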

Most SO questions are about lacking APIs

Screen scraping, etc

Users want APIs

If you don’t have an API users will try to scrape your UI

Causes extra load if programs have to pretend to be people

Minor tweaks break other people’s systems

Gives your competitors a nice entry

Even crappy API is better than no API

If you’re unwilling to API

Follow the Path of least resistance

Don’t reinvent a square wheel

Web
REST/JSON
Windows
COM
Mac
Applescript

APIs are the exception to the “Don’t design up-front” rule

Accidental API

Bad because:

You’re exposing internal implementation issues
Hard to change underlying impl. without borking API
You either won’t change it or break clients
Or start growing many APIs
UIs are mostly used by humans
They are far more flexible.

Instead do:

Think about the API
“If I were an outside programmer…”
Don’t just think it, implement it in your tools/scripts/aux
Think of multiple alternative implementations of your system
The common parts are what should be in the API
The differences are accidents.

No design, just showed up

“Wait! We already have an API!”

Whatever your UI uses
Sometimes the DB itself

Why do so much work?

Thinking so much about your API gives insight into your system

Leads to better architecture
More modular systems

Helps you properly divvy up the system

Small as possible “in the core”
As much as can be “outside”

Layering APIs

One lowest level

Expose logical parts, not implementation
Can be unfriendly as long as it’s complete and fast

ALL Higher layers of API are implemented on lower levels of the API stack

“Let’s do both” problem

Design is choice

More often org problems, not personal problem

e.g.

Win32 CreateFile(………….) vs Unix open(path, flags, mode)
gmpy
Methods on instances vs unified method?
FUCKIN’ BOTH!
No. Don’t.

Perfection is a bad excuse for not trying.

You won’t “get it right the first time”

Iterate

Inconsistency

Arg ordering

under_score vs CamelCase

Mismatched verbs

Remove/Delete/Erase
Plurality

Acronyms

Don’t upcase acronyms in names! HttpSendQuery vs HTTPSendQuery

Why?

Things change over time.
Different people conceptualize differently

Instead:

“data dictionary”
Map words to concepts concretely
New concepts get added first, then implemented
Saves decision overhead

Debugging

You and users will make mistakes

Error message of “an error occurred”

Performance issues

Excessive “round trips”

Improper support for threading/distribution

No async support

Too-picky about timing

Exhibition of Atrocity

Hungarian notation

PEP8

Lining things up is hilarious

Lots of lambdas

Encourages short variables

Making it harder

getters and setters that do nothing

Data Transfer Objects

God modules are a smell.

Refactor.

Global state

Diaper pattern

Bare excepts
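A small sketch of why the diaper pattern hurts. Both functions are hypothetical; the first uses a bare except, the second catches only the failure it expects:

```python
def parse_port_diaper(text):
    """Diaper pattern: catches everything, so real bugs vanish silently."""
    try:
        return int(text)
    except:  # noqa: E722 (deliberately bad; also traps KeyboardInterrupt)
        return None


def parse_port(text):
    """Catch only the failure we expect; let anything else propagate."""
    try:
        return int(text)
    except ValueError:
        return None


hidden = parse_port_diaper(None)  # a caller bug (None) is silently swallowed

try:
    parse_port(None)
    surfaced = False
except TypeError:
    surfaced = True  # the same bug now shows up as a traceback
```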

ASIN.cc/~11GQo – Clean code

snake-guice

mock

Advanced Networking with 0MQ

0mq explodes if you knock :(

grace

C++

Handling ridiculous amounts of data with probabilistic data structures

Open Research Computation

$10K/200G to generate sequence

x1000 sequencers

nonlinear scale

‘de Bruijn graph approach to assembly’
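The notes do not say which probabilistic structure the talk used. As an assumption, here is a minimal Bloom filter over k-mers, the fixed-length substrings that de Bruijn graph assembly works with. BloomFilter and kmers are illustrative names, and the sizes are arbitrary:

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(sequence, k):
    """Yield every length-k substring of the sequence."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


seen = BloomFilter()
for kmer in kmers("ACGTACGTGA", 4):
    seen.add(kmer)
```

Memory use is fixed by num_bits no matter how many k-mers are added; the cost is a false-positive rate that rises as the filter fills.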

Through the side channel

Side channel attack

antioptimization

analyze(Performance + problem domain) -> inputs

Find the research for locating yourself within the same physical host to an EC2 target

pow(a, b, c) has sawtooth performance with largely free performance for factors of the mod

Timing security sensitive code is an important test

QA

Random sleep is wrong

Instead, don’t early-terminate for more even timings
Exposes data size instead, eases brute-force
Perform secret checks in O(1), in every case
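A sketch contrasting an early-terminating comparison with one that does not short-circuit. hmac.compare_digest is the stdlib helper for this (Python 3.3+); the wrapper function names are hypothetical:

```python
import hmac


def leaky_equals(expected, supplied):
    """Early-terminating compare: run time reveals the matching prefix length."""
    if len(expected) != len(supplied):
        return False
    for a, b in zip(expected, supplied):
        if a != b:
            return False
    return True


def constant_time_equals(expected, supplied):
    """Compare without short-circuiting on the first mismatching byte."""
    return hmac.compare_digest(expected, supplied)


secret = b"s3cr3t-token"
accepted = constant_time_equals(secret, b"s3cr3t-token")
rejected = constant_time_equals(secret, b"s3cr3t-tokeX")
```

Both functions return the same answers; the difference is only in how the running time depends on where the first mismatch falls.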

3/13/2011

Outsider’s look at coroutines

c10k problem

Gevent

libevent + greenlet

PEP-342
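For reference, a minimal PEP 342 style coroutine: a generator that receives values through send(). It illustrates the language feature only. Gevent itself builds on greenlets rather than on generators, and running_average is a made-up example:

```python
def running_average():
    """Coroutine: receives numbers via send() and yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)  # prime the coroutine: run to the first yield
averages = [averager.send(v) for v in (10, 20, 60)]
averager.close()
```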

Scaling past 100

Allura

SF tools

sourceforge.net/p/allura

Mongo mongo mogno mongo mongo

Mongo mongo mogno mongo mongo

Mongo mongo mogno mongo mongo

Best practices for impossible deadlines

@onyxfish

github.com/newsapps

“Features no one’s using? Get rid of it.”

Minimal legacy maintenance

“Migrations introduce more bugs than anything else”

Strive for read-only app

Try to render statically OOB

“Never underestimate the value of not having to care how something works.”

“Bees with machine guns”

DDOS tool that spins up in EC2

ushahidi

Crowdsourcing

google-refine

Cleans up bad data

Has python API in early stages
