AVRO-1302. Python: Update documentation to open files as binary to prevent EOL substitution. Contributed by Lars Francke.

git-svn-id: https://svn.apache.org/repos/asf/avro/trunk@1637264 13f79535-47bb-0310-9956-ffa450edef68
cutting committed Nov 6, 2014
1 parent 535f6ef commit 97165ab
Showing 2 changed files with 17 additions and 6 deletions.
3 changes: 3 additions & 0 deletions CHANGES.txt
@@ -55,6 +55,9 @@ Trunk (not yet released)
AVRO-1489. Java: Avro fails to build with OpenJDK 8. (Ricardo Arguello via
tomwhite)

+AVRO-1302. Python: Update documentation to open files as binary to
+prevent EOL substitution. (Lars Francke via cutting)
+
Avro 1.7.7 (23 July 2014)

NEW FEATURES
20 changes: 14 additions & 6 deletions doc/src/content/xdocs/gettingstartedpython.xml
@@ -136,14 +136,14 @@ import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

-schema = avro.schema.parse(open("user.avsc").read())
+schema = avro.schema.parse(open("user.avsc", "rb").read())

-writer = DataFileWriter(open("users.avro", "w"), DatumWriter(), schema)
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
writer.close()

-reader = DataFileReader(open("users.avro", "r"), DatumReader())
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print user
reader.close()
@@ -153,11 +153,19 @@ reader.close()
{u'favorite_color': None, u'favorite_number': 256, u'name': u'Alyssa'}
{u'favorite_color': u'red', u'favorite_number': 7, u'name': u'Ben'}
</source>
+<p>
+Do make sure that you open your files in binary mode (i.e. using the modes
+<code>wb</code> or <code>rb</code> respectively). Otherwise you might
+generate corrupt files due to
+<a href="http://docs.python.org/library/functions.html#open">
+automatic replacement</a> of newline characters with the
+platform-specific representations.
+</p>
<p>
Let's take a closer look at what's going on here.
</p>
<source>
-schema = avro.schema.parse(open("user.avsc").read())
+schema = avro.schema.parse(open("user.avsc", "rb").read())
</source>
<p>
<code>avro.schema.parse</code> takes a string containing a JSON schema
@@ -167,7 +175,7 @@ schema = avro.schema.parse(open("user.avsc").read())
user.avsc schema file here.
</p>
<source>
-writer = DataFileWriter(open("users.avro", "w"), DatumWriter(), schema)
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
</source>
<p>
We create a <code>DataFileWriter</code>, which we'll use to write
@@ -201,7 +209,7 @@ writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
ignored.
</p>
<source>
-reader = DataFileReader(open("users.avro", "r"), DatumReader())
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
</source>
<p>
We open the file again, this time for reading back from disk. We use
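
Note on the binary-mode change: the corruption this commit guards against can be reproduced without Avro. The sketch below is not part of the commit. It assumes Python 3, uses a made-up byte string (`payload`) in place of real Avro output, and passes `newline="\r\n"` to imitate the newline translation that text mode performs on Windows, so the effect is visible on any platform. The helper name `roundtrip` is hypothetical.

```python
import os
import tempfile

# Hypothetical payload standing in for Avro's binary encoding. It contains
# the byte 0x0A, which text mode interprets as a newline character.
payload = b"Obj\x01\x0a\x00\xff"


def roundtrip(data, binary):
    """Write data to a temporary file and return the bytes stored on disk."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            # Binary mode writes the bytes exactly as given.
            with open(path, "wb") as f:
                f.write(data)
        else:
            # Text mode with newline="\r\n" rewrites every "\n" as "\r\n",
            # imitating the default translation on Windows.
            with open(path, "w", encoding="latin-1", newline="\r\n") as f:
                f.write(data.decode("latin-1"))
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


binary_bytes = roundtrip(payload, binary=True)
text_bytes = roundtrip(payload, binary=False)

print(binary_bytes == payload)  # True
print(text_bytes == payload)    # False: a 0x0D byte was inserted before 0x0A
```

A single inserted byte is enough to shift every offset after it, which is why a binary container format written through text mode may be unreadable afterwards.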