In [2]:
import re
start_with_hash = 0
with open("sample.txt", "r") as f:  # read-only is enough; nothing is written
    for line in f:
        if re.match('^#', line):
            start_with_hash += 1

print(start_with_hash)

3


In [3]:
from collections import Counter
def get_domain(email_address):
    ''' return the domain from email address'''
    return email_address.lower().split('@')[-1]

with open('sample.txt', 'r') as f:
    domain_counter = Counter(get_domain(line.strip()) for line in f if '@' in line)

domain_counter

Counter({'example.com': 2,
         'example.org': 1,
         'example.net': 1,
         'example.edu': 1})

### Delimited Files
The hypothetical email addresses file we just processed had one address per line. More
frequently you’ll work with files with lots of data on each line. These files are very often
either comma-separated or tab-separated. Each line has several fields, with a comma (or a
tab) indicating where one field ends and the next field starts.

This starts to get complicated when you have fields with commas and tabs and newlines in
them (which you inevitably do). For this reason, it’s pretty much always a mistake to try to
parse them yourself. Instead, you should use Python’s csv module (or the pandas library).
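
To see why hand-parsing goes wrong, here is a small sketch comparing a naive `split(',')` with `csv.reader` on a single line whose middle field contains a comma. The sample line is made up for illustration, and `io.StringIO` stands in for a real file.

```python
import csv
import io

# Hypothetical line: the second field contains a comma, so it is quoted.
line = 'test2,"success, kind of",Tuesday\n'

# Naive splitting cuts the quoted field in two and keeps the quote characters.
naive_fields = line.strip().split(',')

# csv.reader respects the quotes and returns the three intended fields.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)
print(parsed_fields)
```

The naive version yields four pieces instead of three, which is exactly the kind of silent corruption the csv module is meant to prevent.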

For technical reasons that you should feel free to blame on Microsoft, in Python 3 you should
always open csv files in text mode with newline='' (see the csv module documentation). Only
Python 2 needed binary mode, with a b after the r or w.

If your file has no headers (which means you probably want each row as a list, and
which places the burden on you to know what’s in each column), you can use csv.reader
to iterate over the rows, each of which will be an appropriately split list.

In [19]:
# if we have tab delimited files we can process them like this 

import csv
def process(date , symbol , closing_price):
    print(f'Date : {date} Symbol: {symbol} Closing Price: {closing_price}')

with open('stock_prices.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        date = row[0]
        symbol = row[1]
        closing_price = float(row[2])
        process(date, symbol, closing_price)

Date : 6/20/2014 Symbol: AAPL Closing Price: 90.91
Date : 6/20/2014 Symbol: MSFT Closing Price: 41.68
Date : 6/20/2014 Symbol: FB Closing Price: 64.5
Date : 6/19/2014 Symbol: AAPL Closing Price: 91.86
Date : 6/19/2014 Symbol: MSFT Closing Price: 41.51
Date : 6/19/2014 Symbol: FB Closing Price: 64.34


If your file has headers:
```csv
date:symbol:closing_price
6/20/2014:AAPL:90.91
6/20/2014:MSFT:41.68
6/20/2014:FB:64.5
```
you can either skip the header row (with an initial call to next(reader)) or get each row
as a dict (with the headers as keys) by using csv.DictReader:
```python
with open('colon_delimited_stock_prices.txt', 'r', newline='') as f:
    reader = csv.DictReader(f, delimiter=':')
    for row in reader:
        date = row["date"]
        symbol = row["symbol"]
        closing_price = float(row["closing_price"])
        process(date, symbol, closing_price)
```
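
The other option, skipping the header row, might look like the sketch below. It assumes Python 3, where the call is `next(reader)`, and it uses an in-memory `io.StringIO` with made-up contents in place of the real file.

```python
import csv
import io

# Hypothetical in-memory stand-in for a colon-delimited file with a header row.
data = io.StringIO(
    "date:symbol:closing_price\n"
    "6/20/2014:AAPL:90.91\n"
    "6/20/2014:MSFT:41.68\n"
)

reader = csv.reader(data, delimiter=':')
header = next(reader)  # consume the header row so the loop only sees data

# Each remaining row is a list of three strings; convert the price to float.
rows = [(date, symbol, float(price)) for date, symbol, price in reader]

print(header)
print(rows)
```

With this approach you still index by position, so it is on you to keep track of which column is which.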
Even if your file doesn’t have headers you can still use DictReader by passing it the keys
as a fieldnames parameter.
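
As a minimal sketch of that last point, the snippet below reads headerless tab-delimited data with `csv.DictReader` and an explicit `fieldnames` list. The column names are chosen here for illustration, and the data lives in an `io.StringIO` rather than a real file.

```python
import csv
import io

# Hypothetical headerless, tab-delimited data.
data = io.StringIO("6/20/2014\tAAPL\t90.91\n6/20/2014\tMSFT\t41.68\n")

# Because fieldnames is given, the first line is treated as data, not as a header.
reader = csv.DictReader(
    data,
    fieldnames=["date", "symbol", "closing_price"],
    delimiter='\t',
)
rows = list(reader)

for row in rows:
    print(row["date"], row["symbol"], float(row["closing_price"]))
```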

You can similarly write out delimited data using csv.writer:

```python 
today_prices = { 'AAPL' : 90.91, 'MSFT' : 41.68, 'FB' : 64.5 }
with open('comma_delimited_stock_prices.txt', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=',')
    for stock, price in today_prices.items():
        writer.writerow([stock, price])

```

csv.writer will do the right thing if your fields themselves have commas in them. Your
own hand-rolled writer probably won’t. 

For example, if you attempt:
```python
results = [["test1", "success", "Monday"],
           ["test2", "success, kind of", "Tuesday"],
           ["test3", "failure, kind of", "Wednesday"],
           ["test4", "failure, utter", "Thursday"]]

# don't do this!
with open('bad_csv.txt', 'w') as f:
    for row in results:
        f.write(",".join(map(str, row)))  # might have too many commas in it!
        f.write("\n")                     # row might have newlines as well!
```

You will end up with a csv file that looks like:
```csv
test1,success,Monday
test2,success, kind of,Tuesday
test3,failure, kind of,Wednesday
test4,failure, utter,Thursday
```
and that no one will ever be able to make sense of.
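
For contrast, here is a sketch of the same rows written with `csv.writer` and then read back. It writes to an in-memory `io.StringIO` buffer instead of a file, which is a choice made here only to keep the example self-contained.

```python
import csv
import io

results = [["test1", "success", "Monday"],
           ["test2", "success, kind of", "Tuesday"],
           ["test3", "failure, kind of", "Wednesday"],
           ["test4", "failure, utter", "Thursday"]]

# Write to an in-memory buffer; with a real file, open it with newline=''.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(results)
written = buffer.getvalue()

# Fields containing commas are quoted, so reading them back recovers the rows.
round_trip = list(csv.reader(io.StringIO(written)))

print(written)
print(round_trip == results)
```

Because the writer quotes any field that contains the delimiter, the three-column structure survives the round trip.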