# <p style="color:red">Chapter 9</p>

### 1. File objects can be used to access not only normal disk files, but also any other type of “file” that uses that abstraction. The open() built-in function (see below) returns a file object that is then used for all succeeding operations on the file in question.

### 2. As the key to opening file doors, the open() [and file()] built-in function provides a general interface to initiate the file input/output (I/O) process.

* The open() BIF returns a file object if the file is opened successfully; otherwise it raises an error.

* file_object = open(file_name, access_mode='r', buffering=-1)

* A 'U' mode also exists for universal NEWLINE support (see below).

* Any file opened with mode 'r' or 'U' must exist.

* Any file opened with 'w' will be truncated (its existing data cleared) first if it exists, and then the file is (re)created.

* All writes to files opened with 'a' will be from end-of-file, even if you seek elsewhere during access.

* If a file opened with 'a' does not exist, it will be created, just as if you had opened it in 'w' mode.

* If access_mode is not given, it defaults automatically to 'r'.

* The other optional argument, buffering, is used to indicate the type of buffering that should be performed when accessing the file. 
    * A value of 0 means no buffering should occur.
    * A value of 1 signals line buffering.
    * Any value greater than 1 indicates buffered I/O with the given value as the buffer size.
    * A missing or negative value indicates that the system default buffering scheme should be used, which is line buffering for any teletype or tty-like device and normal buffering for everything else.
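
The access modes above can be sketched as follows. This is a minimal illustration, not the chapter's own code: it assumes Python 3 (the notes describe Python 2, but open() accepts the same 'r', 'w', and 'a' modes), and the scratch file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file, or truncates it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

f = open(path, "w")       # truncates: "first" is discarded
f.write("second\n")
f.close()

# 'a' always writes at end-of-file (and would create a missing file).
f = open(path, "a")
f.write("third\n")
f.close()

# 'r' is the default mode; the file must exist.
f = open(path)
contents = f.read()
f.close()

# Opening a missing file for reading raises an error.
try:
    open(path + ".missing", "r")
    missing_raised = False
except IOError:           # an alias of OSError in Python 3
    missing_raised = True
```

After this runs, `contents` holds only the text written by the second 'w' open and the 'a' open, which shows the truncation and append behaviour side by side.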


### 3. file() factory function
### * Both open() and file() do exactly the same thing and one can be used in place of the other. Anywhere you see references to open(), you can mentally substitute file() without any side effects whatsoever.

### * Generally, the accepted style is that you use open() for reading/writing files, while file() is best used when you want to show that you are dealing with file objects, e.g., isinstance(f, file).

### 4. Universal NEWLINE Support (UNS)

### * The os module can help you handle files across different platforms, which terminate lines with different endings, e.g., \n, \r, or \r\n.

### * When you use the 'U' flag to open a file, all line separators (or terminators) will be returned by Python via any file input method, i.e., read*(), as a NEWLINE character ( \n ) regardless of what the line-endings are.

### * This feature will also support files that have multiple types of line-endings. A file.newlines attribute tracks the types of line separation characters “seen.”
### * file.newlines is None if no line terminators have been read yet, a string if only one type has been seen, or a tuple containing every type encountered so far. Note that UNS only applies to reading text files. There is no equivalent handling of file output.
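
A rough sketch of universal newline handling follows. It assumes Python 3, where text mode with `newline=None` (the default) plays the role of the 'U' flag described above; the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "mixed_endings.txt")

# Write raw bytes so the three different terminators are stored unchanged.
f = open(path, "wb")
f.write(b"unix\nold mac\rwindows\r\n")
f.close()

# Assumption: Python 3 text mode with newline=None stands in for mode 'U'.
f = open(path, "r", newline=None)
lines = f.readlines()     # every line now ends in '\n'
seen = f.newlines         # the terminators encountered while reading
f.close()
```

Because the file mixes all three endings, `seen` ends up as a tuple rather than a single string.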

### 5. File methods come in four different categories: input, output, movement within a file, which we will call “intra-file motion,” and miscellaneous.

#### a. input
#### * read(size_of_bytes=-1):  read bytes directly into a string, reading at most the number of bytes indicated. If no size is given (the default value is set to integer -1) or size is negative, the file will be read to the end.

#### * readline(size_of_bytes=-1): read a line and return it (including the line terminator). The size defaults to -1, meaning read until the line-ending characters (or EOF) are found. If a size is given, an incomplete line may be returned when the line exceeds size bytes.

#### * readlines(sizehint): reads all (remaining) lines and returns them as a list of strings. sizehint is a hint on the maximum size desired in bytes. If provided and greater than zero, approximately sizehint bytes in whole lines are read (perhaps slightly more to round up to the next buffer size) and returned as a list.



#### b. output

#### * write(): It takes a string that can consist of one or more lines of text data or a block of bytes and writes the data to the file.
#### * writelines(): operates on a list just like readlines(), but takes a list of strings and writes them out to a file. Line termination characters are not inserted between each line, so if desired, they must be added to the end of each line before writelines() is called.

#### * When reading lines in from a file using file input methods like read() or readlines(), Python does not remove the line termination characters. Removing them is up to the programmer.

#### * Similarly, output methods like write() or writelines() do not add line terminators for the programmer... you have to do it yourself before writing the data to the file.
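
The input and output methods above can be tried together in a short sketch. It assumes Python 3 text mode, so sizes count characters rather than bytes, and the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "io_demo.txt")

# Neither write() nor writelines() adds terminators, so each string
# carries its own '\n'.
f = open(path, "w")
f.write("line one\n")
f.writelines(["line two\n", "line three\n"])
f.close()

f = open(path)
head = f.read(4)              # at most 4 characters
rest_of_line = f.readline()   # remainder of line 1, terminator included
remaining = f.readlines()     # all remaining lines as a list of strings
f.close()

# Stripping the terminators is left to the programmer.
stripped = [line.rstrip("\n") for line in remaining]
```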

#### c. intra-file motion
#### * The seek() method (analogous to the fseek() function in C) moves the file pointer to different positions within the file.
#### * The offset in bytes is given along with a relative offset location, whence.
#### * A value of 0, the default, indicates distance from the beginning of the file (a position measured from the beginning of a file is also known as the absolute offset).
#### * A value of 1 indicates movement from the current location in the file.
#### * A value of 2 indicates that the offset is from the end of the file.
#### * Use of the seek() method comes into play when opening a file for read and write access.
#### * tell() is a complementary method to seek(); it reports the current position within the file, in bytes from the beginning of the file.
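
A small sketch of the three whence values follows. It opens a hypothetical scratch file in binary read/write mode ('w+b') so that offsets are plain byte counts; this is an illustration under those assumptions rather than the chapter's own example.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")

f = open(path, "w+b")     # read and write access, binary mode
f.write(b"0123456789")
end_pos = f.tell()        # position after writing 10 bytes

f.seek(0)                 # whence=0 (default): absolute offset
first = f.read(3)         # reads bytes 0..2, position is now 3

f.seek(2, 1)              # whence=1: skip 2 bytes from the current position
middle = f.read(1)        # reads the byte at offset 5

f.seek(-2, 2)             # whence=2: 2 bytes before end-of-file
tail = f.read()           # reads the last 2 bytes
final_pos = f.tell()
f.close()
```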

### 5. file iteration:
    

In [2]:
# Going through a file line by line is simple.
f = open('test_input.txt')  # assumes this file already exists
for eachLine in f:  # f is its own iterator; each pass reads in the next line
    pass  # process eachLine here
f.close()

### File objects are their own iterators, meaning that you can iterate through the lines of a file with a for loop without having to call read*() methods. Alternatively, the iterator's next method, file.next(), can be called to read in the next line of the file. Like all other iterators, a file raises StopIteration when no more lines are available.
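
The iterator behaviour can be sketched as below. This assumes Python 3, where the built-in `next(f)` replaces the Python 2 spelling `f.next()`; the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "iter_demo.txt")
f = open(path, "w")
f.write("alpha\nbeta\n")
f.close()

# A for loop pulls one line per pass without any read*() call.
f = open(path)
collected = []
for each_line in f:
    collected.append(each_line)
f.close()

# Stepping the iterator by hand; next(f) is the Python 3 form of f.next().
f = open(path)
first = next(f)
second = next(f)
try:
    next(f)
    exhausted = False
except StopIteration:     # raised once no lines remain
    exhausted = True
f.close()
```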

### 6. close file
#### * The close() method completes access to a file by closing it.

#### * The Python garbage collection routine will also close a file when the file object's reference count has decreased to zero.

#### * The fileno() method returns the integer file descriptor of the open file.

#### * Rather than waiting for the (contents of the) output buffer to be written to disk, calling the flush() method will cause the contents of the internal buffer to be written (or flushed) to the file immediately.

#### * isatty() is a Boolean built-in method that returns True if the file is a tty-like device and False otherwise. 
#### * The truncate() method truncates the file to the size at the current file position or the given size in bytes.
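
The miscellaneous methods in this list can be exercised in one short sketch. It assumes Python 3 and a hypothetical scratch file opened in binary read/write mode so that positions are byte offsets.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "misc_demo.bin")

f = open(path, "w+b")
f.write(b"hello world")
f.flush()                                 # push the internal buffer out now
size_after_flush = os.path.getsize(path)  # the 11 bytes are already on disk

descriptor = f.fileno()                   # integer file descriptor
on_terminal = f.isatty()                  # a regular file is not a tty

f.seek(5)
f.truncate()                              # no size given: cut at current position
f.seek(0)
kept = f.read()

f.close()
was_closed = f.closed
```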

### 7. os module

* linesep: String used to separate lines in a file
* sep: String used to separate file pathname components
* pathsep: String used to delimit a set of file pathnames
* curdir: String name for current working directory
* pardir: String name for parent (of current working directory)
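
These attributes are plain strings whose values depend on the platform, so a quick way to see them is to build paths by hand. The directory names below are made up for illustration.

```python
import os

# Build a pathname from components with the platform's separator.
joined = os.sep.join(["usr", "local", "bin"])

# Build a search-path style string from several (hypothetical) directories.
search_path = os.pathsep.join(["first_dir", "second_dir"])

# The parent of the current directory, spelled with curdir and pardir.
parent_of_here = os.path.join(os.curdir, os.pardir)

# The terminator this platform uses for lines in text files.
terminator = os.linesep
```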

#### A comma placed at the end of a print statement suppresses the NEWLINE character that print normally adds at the end of output.

#### The truncate() method takes one optional argument, size. If it is given, then the file will be truncated to, at most, size bytes. If you call truncate() without passing in a size, it will default to the current location in the file.


In [8]:
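import os

# Collect lines from the user and write them to a new file.
filename = input('Enter file name: ')
fobj = open(filename, 'w')
while True:
    aLine = input("Enter a line ('.' to quit): ")
    if aLine != ".":
        # the input call drops the line terminator, so add one back explicitly
        fobj.write('%s%s' % (aLine, os.linesep))
    else:
        break
fobj.close()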
    

Enter file name: test_input.txt
Enter a line ('.' to quit): This is a beautiful girl!
Enter a line ('.' to quit): agaghoahg asgha.
Enter a line ('.' to quit): .


#### raw_input() does not preserve the NEWLINE from the user input, so the line terminator has to be added explicitly before each line is written.


### 8. file attributes
#### * file.closed: True if the file is closed, False otherwise
#### * file.encoding: Encoding that this file uses. When Unicode strings are written to the file, they are converted to byte strings using file.encoding; a value of None indicates that the system default encoding for converting Unicode strings should be used
#### * file.mode: Access mode with which the file was opened
#### * file.name: Name of the file
#### * file.newlines: None if no line separators have been read, a string consisting of one type of line separator, or a tuple containing all types of line termination characters read so far
#### * file.softspace: 0 if space explicitly required with print, 1 otherwise; rarely used by the programmer and generally for internal use only
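
A brief look at some of these attributes on a live file object. This sketch assumes Python 3, where `name`, `mode`, and `closed` behave as described above but `softspace` no longer exists; the file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "attrs_demo.txt")

f = open(path, "w")
name_matches = (f.name == path)   # name is the path passed to open()
mode = f.mode                     # the access mode used
open_state = f.closed             # still open at this point
f.close()
closed_state = f.closed           # flips once close() has been called
```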

### 9. There are generally three standard files that are made available to you when your program starts. These are standard input (usually the keyboard), standard output (buffered output to the monitor or display), and standard error (unbuffered output to the screen). (The “buffered” or “unbuffered” output refers to that third argument to open().)

* These files are named stdin, stdout, and stderr

### 10. Python makes these file handles available to you from the sys module. Once you import sys, you have access to these files as sys.stdin, sys.stdout, and sys.stderr. The print statement normally outputs to sys.stdout while the raw_input() built-in function receives its input from sys.stdin.

### 11. Just remember that since sys.* are files, you have to manage the line separation characters. The print statement has the built-in feature of automatically adding one to the end of a string to output.
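
Writing to the standard files directly looks like writing to any other file, terminator included. In the sketch below, `emit` is a hypothetical helper, and an in-memory `io.StringIO` stands in for a stream so the result can be inspected.

```python
import io
import sys

def emit(stream, text):
    """Write text plus an explicit terminator, which print would add for us."""
    stream.write(text + "\n")

# The standard streams are file objects like any other.
emit(sys.stdout, "to standard output")
emit(sys.stderr, "to standard error")

# Any file-like object works the same way.
buffer = io.StringIO()
emit(buffer, "captured")
captured = buffer.getvalue()
```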

### 12. The sys module also provides access to any command-line arguments via sys.argv.
### * sys.argv is the list of command-line arguments
### * sys.argv[0] is always the program name.
### * len(sys.argv) is the number of command-line arguments (aka argc)
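
The three points above can be sketched with a small helper. `describe_args` and the sample argument list are hypothetical; a real script would pass `sys.argv` instead.

```python
import sys

def describe_args(argv):
    """Return (program name, remaining arguments, argc) for an argv-style list."""
    return argv[0], argv[1:], len(argv)

# A sample list shaped like sys.argv, so the result does not depend on
# how this example is launched.
program, args, argc = describe_args(["myscript.py", "-v", "data.txt"])

# In a real script: describe_args(sys.argv)
```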

### 12. Python has two modules to help process command-line arguments. The first (and original), getopt, is easier but less sophisticated, while optparse, introduced in Python 2.3, is a more powerful library and is much more object-oriented than its predecessor. If you are just getting started, we recommend getopt, but once you outgrow its feature set, then check out optparse.
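
As a starting point, a minimal getopt sketch might look like this. The option names (`-v`/`--verbose`, `-o`/`--output`) and the `parse` helper are made up for illustration.

```python
import getopt

def parse(argv):
    """Parse a hypothetical -v/--verbose flag and -o/--output FILE option."""
    # "vo:" means -v takes no value and -o requires one; the long
    # options mirror that ("output=" requires a value).
    opts, positional = getopt.getopt(argv, "vo:", ["verbose", "output="])
    settings = {"verbose": False, "output": None}
    for flag, value in opts:
        if flag in ("-v", "--verbose"):
            settings["verbose"] = True
        elif flag in ("-o", "--output"):
            settings["output"] = value
    return settings, positional

# Pass everything after the program name, i.e. sys.argv[1:] in a script.
settings, positional = parse(["-v", "--output", "out.txt", "in.txt"])
```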

### 13. File system



#### a. Access to your file system occurs mostly through the Python os module. This module serves as the primary interface to your operating system facilities and services from Python. The os module is actually a front-end to the real module that is loaded, a module that is clearly operating system–dependent.

#### b. In addition to managing processes and the process execution environment, the os module performs most of the major file system operations that the application developer may wish to take advantage of. These features include removing and renaming files, traversing the directory tree, and managing file accessibility.

#### c. os.path performs specific pathname operations. The os.path module is accessible through the os module. os.path provides functions to manage and manipulate file pathname components, obtain file or directory information, and make file path inquiries.

#### d. The os module and os.path allow for consistent access to the file system regardless of platform or operating system.

#### os module file processing
#### * mkfifo()/mknod(): Create named pipe/create filesystem node

#### * remove()/unlink():  Delete file

#### * rename()/renames(): Rename file

#### * stat(): Return file statistics

#### * symlink(): Create symbolic link

#### * utime(): Update timestamp

#### * tmpfile(): Create and open ('w+b') new temporary file

#### * walk(): Generate filenames in a directory tree
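
A few of the file-processing functions listed above, sketched against a temporary directory. The file names are hypothetical, and the example assumes Python 3.

```python
import os
import tempfile

# Hypothetical scratch directory and file names.
root = tempfile.mkdtemp()
old_path = os.path.join(root, "draft.txt")
new_path = os.path.join(root, "final.txt")

f = open(old_path, "w")
f.write("12345")
f.close()

os.rename(old_path, new_path)        # rename the file
size = os.stat(new_path).st_size     # file statistics: size in bytes

# walk() yields (dirpath, dirnames, filenames) for each directory in the tree.
found = []
for dirpath, dirnames, filenames in os.walk(root):
    found.extend(filenames)

os.remove(new_path)                  # delete the file
still_there = os.path.exists(new_path)
```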

#### os module directory access functions:
#### * chdir()/fchdir(): Change working directory/via a file descriptor
#### * chroot(): Change root directory of current process
#### * listdir(): List files in directory
#### * getcwd()/getcwdu(): Return current working directory/same but in Unicode
#### * mkdir()/makedirs(): Create directory(ies)
#### * rmdir()/removedirs(): Remove directory(ies)
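
The directory functions can be tried the same way. This is a sketch with hypothetical directory names; note that removedirs() prunes empty parent directories upward until it reaches one that is not empty.

```python
import os
import tempfile

# Hypothetical scratch directory.
root = tempfile.mkdtemp()
start_dir = os.getcwd()

os.mkdir(os.path.join(root, "single"))                # one level
os.makedirs(os.path.join(root, "nested", "deeper"))   # intermediate levels too
entries = sorted(os.listdir(root))

os.chdir(root)                                        # change working directory
moved = os.path.samefile(os.getcwd(), root)
os.chdir(start_dir)                                   # and go back

# Removes "deeper", then the now-empty "nested"; it stops at root
# because "single" is still inside it.
os.removedirs(os.path.join(root, "nested", "deeper"))
os.rmdir(os.path.join(root, "single"))
left_over = os.listdir(root)
```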

### os access/permissions:

#### * access(): Verify permission modes

#### * chmod(): Change permission modes

#### * chown()/lchown(): Change owner and group ID/same, but do not follow links

#### * umask(): Set default permission modes
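
A cautious sketch of the permission functions. The exact permission bits that result are platform-dependent, so the example only checks the owner-read bit; the file name is hypothetical.

```python
import os
import stat
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "perm_demo.txt")
f = open(path, "w")
f.close()

exists = os.access(path, os.F_OK)            # does the path exist?

os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # owner read and write
mode = stat.S_IMODE(os.stat(path).st_mode)   # permission bits only
owner_can_read = bool(mode & stat.S_IRUSR)
readable = os.access(path, os.R_OK)          # may this process read it?

# umask() sets a new default mask and returns the previous one.
old_mask = os.umask(0o022)
os.umask(old_mask)                           # restore the original mask
```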

### Separation

#### * basename(): Remove directory path and return leaf name
#### * dirname(): Remove leaf name and return directory path
#### * join(): Join separate components into single pathname
#### * split(): Return (dirname(), basename()) tuple
#### * splitdrive(): Return (drivename, pathname) tuple
#### * splitext(): Return (filename, extension) tuple
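
The separation functions fit together as in the sketch below. The path components are made up, and the path is built with join() so the example does not hard-code a separator.

```python
import os

# Hypothetical relative path, joined with the platform's separator.
full = os.path.join("home", "user", "notes.txt")

directory, leaf = os.path.split(full)       # (dirname, basename)
stem, extension = os.path.splitext(leaf)    # (filename, extension)

same_dir = os.path.dirname(full) == directory
same_leaf = os.path.basename(full) == leaf

# A relative path like this one has no drive component.
drive, rest = os.path.splitdrive(full)
```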

### 14. The *db* series of modules writes data in the traditional DBM format.
#### * If you are not sure which one to use, or do not care, the generic anydbm module detects which DBM-compatible modules are installed on your system and uses the “best” one at its disposal.

#### a. marshal and pickle provide serialization or pickling of Python objects

#### b. *db* provides a dictionary- and file-like object to allow for persistent storage of strings

#### c. shelve provides serialization or pickling of Python objects as well as a dictionary- and file-like object to allow for persistent storage of such flattened objects

#### d. The shelve module uses the anydbm module to find a suitable DBM module, then uses cPickle to perform the pickling process.

#### e. The two main functions in the pickle module are dump() and load(). The dump() function takes a file handle and a data object and saves the object in a format it understands to the given file. The load() function reads such a file and restores the original object.

#### f. The fileinput module iterates over a set of input files and reads their contents one line at a time, allowing you to iterate over each line. File names that are not explicitly given will be assumed to be provided from the command line.

#### g. The glob and fnmatch modules allow for file name pattern-matching in the good old-fashioned Unix shell-style, for example, using the asterisk ( * ) wildcard character for all string matches and the ( ? ) for matching single characters.

#### h. In addition, Unix-flavored systems also support the “~user” notation indicating the home directory for a specific user.

#### i. The gzip and zlib modules provide direct file access to the zlib compression library. The gzip module, written on top of the zlib module, allows for standard file access, but provides for automatic gzip-compatible compression and decompression. bz2 is like gzip but for bzipped files.

#### j. The shutil module furnishes high-level file access, performing such functions as copying files, copying file permissions, and recursive directory tree copying, to name a few.

#### k. Some other Python modules that generate file-like objects include network and file socket objects (socket module), the popen*() file objects that connect your application to other running processes (os and popen2 modules), the fdopen() file object used in low-level file access (os module), and opening a network connection to an Internet Web server via its Uniform Resource Locator (URL) address (urllib module). Please be aware that not all standard file methods may be implemented for these objects. Likewise, they may provide functionality in addition to what is available for regular files.
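
Returning to item e, a pickle round trip can be sketched as follows. It assumes Python 3, where pickle files are opened in binary mode; the record and file name are made up for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file and sample data.
path = os.path.join(tempfile.mkdtemp(), "record.pkl")
record = {"name": "chapter 9", "topics": ["files", "os", "pickle"]}

# dump() takes the object and an open file handle (object first).
f = open(path, "wb")
pickle.dump(record, f)
f.close()

# load() reads the file back and rebuilds an equal, but separate, object.
f = open(path, "rb")
restored = pickle.load(f)
f.close()
```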