# Pandas

Materials:
* S. V. Makrushin, "Lecture 2: The Pandas Library"
* https://pandas.pydata.org/docs/user_guide/index.html#
* https://pandas.pydata.org/docs/reference/index.html
* Wes McKinney, "Python for Data Analysis"

## Tasks to work through together

1. Load the data from the file `sp500hst.txt` and name the columns according to their contents: `"date", "ticker", "open", "high", "low", "close", "volume"`.

In [3]:
import numpy as np
import pandas as pd

In [8]:
data = pd.read_csv('data/sp500hst.txt', header=None, names=["date", "ticker", "open", "high", "low", "close", "volume"], parse_dates=["date"])
data

Unnamed: 0,date,ticker,open,high,low,close,volume
0,2009-08-21,A,25.60,25.6100,25.220,25.55,34758
1,2009-08-24,A,25.64,25.7400,25.330,25.50,22247
2,2009-08-25,A,25.50,25.7000,25.225,25.34,30891
3,2009-08-26,A,25.32,25.6425,25.145,25.48,33334
4,2009-08-27,A,25.50,25.5700,25.230,25.54,70176
...,...,...,...,...,...,...,...
122569,2010-08-13,ZMH,51.72,51.9000,51.380,51.44,14561
122570,2010-08-16,ZMH,51.13,51.4700,50.600,51.00,13489
122571,2010-08-17,ZMH,51.14,51.6000,50.890,51.21,20498
122572,2010-08-19,ZMH,51.63,51.6300,50.170,50.22,18259
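
The same `read_csv` call can be tried without the data file. The sketch below is only an illustration: the three rows are hypothetical and use an ISO date format, which is an assumption about the layout of `sp500hst.txt`.

```python
import io

import pandas as pd

# Hypothetical headerless sample; the real file layout is assumed, not known
raw = io.StringIO(
    "2009-08-21,A,25.60,25.61,25.22,25.55,34758\n"
    "2009-08-24,A,25.64,25.74,25.33,25.50,22247\n"
    "2009-08-21,AA,12.64,12.73,12.49,12.55,550543\n"
)
columns = ["date", "ticker", "open", "high", "low", "close", "volume"]

# header=None says the file has no header row; names supplies the column labels
sample = pd.read_csv(raw, header=None, names=columns, parse_dates=["date"])
print(sample.dtypes)
```

With `parse_dates`, the `date` column arrives as a datetime type, so the `.dt` accessor can be used on it right away.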


2. Compute the mean of each of columns 3-6 (`open`, `high`, `low`, `close`).

In [12]:
data[["open", "high", "low", "close"]].mean()

open     42.595458
high     43.102243
low      42.054464
close    42.601865
dtype: float64

3. Add a column containing only the number of the month that each date falls in.

In [17]:
data["month"] = data["date"].dt.month
data

Unnamed: 0,date,ticker,open,high,low,close,volume,month
0,2009-08-21,A,25.60,25.6100,25.220,25.55,34758,8
1,2009-08-24,A,25.64,25.7400,25.330,25.50,22247,8
2,2009-08-25,A,25.50,25.7000,25.225,25.34,30891,8
3,2009-08-26,A,25.32,25.6425,25.145,25.48,33334,8
4,2009-08-27,A,25.50,25.5700,25.230,25.54,70176,8
...,...,...,...,...,...,...,...,...
122569,2010-08-13,ZMH,51.72,51.9000,51.380,51.44,14561,8
122570,2010-08-16,ZMH,51.13,51.4700,50.600,51.00,13489,8
122571,2010-08-17,ZMH,51.14,51.6000,50.890,51.21,20498,8
122572,2010-08-19,ZMH,51.63,51.6300,50.170,50.22,18259,8


4. Compute the total trading volume for each ticker.

In [32]:
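# Traded value per row: closing price times the number of shares traded
data["total"] = data["close"] * data["volume"]
# Select the column before aggregating so that non-numeric columns are never summed
data.groupby("ticker")["total"].sum()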

ticker
A       2.604052e+08
AA      1.084986e+09
AAPL    1.174010e+10
ABC     2.521959e+08
ABT     9.612890e+08
            ...     
XTO     9.628561e+08
YHOO    9.089127e+08
YUM     4.108175e+08
ZION    3.088669e+08
ZMH     2.747670e+08
Name: total, Length: 524, dtype: float64
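
"Total trading volume" can be read in two ways: the number of shares traded, or the traded value (price times volume) that the cell above computes. A minimal sketch of both on a hypothetical toy frame:

```python
import pandas as pd

# Hypothetical toy data, not taken from sp500hst.txt
toy = pd.DataFrame({
    "ticker": ["A", "A", "AA"],
    "close": [10.0, 20.0, 5.0],
    "volume": [100, 200, 1000],
})

# Reading 1: total number of shares traded per ticker
shares = toy.groupby("ticker")["volume"].sum()

# Reading 2: total traded value per ticker (close * volume)
value = (toy["close"] * toy["volume"]).groupby(toy["ticker"]).sum()

print(shares)
print(value)
```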

5. Load the data from the file `sp500hst.txt` and name the columns according to their contents: `"date", "ticker", "open", "high", "low", "close", "volume"`. Add a column with the full name behind each ticker, using the data from the file `sp_data2.csv`. Handle tickers whose names are missing gracefully.

In [56]:
data = pd.read_csv('data/sp500hst.txt', header=None, names=["date", "ticker", "open", "high", "low", "close", "volume"], parse_dates=["date"])
names = pd.read_csv('data/sp_data2.csv', sep=";", header=None, names=["ticker", "ticker_name", ""])
data = pd.merge(data, names, left_on='ticker', right_on='ticker', how='left')[["date", "ticker", "ticker_name", "open", "high", "low", "close", "volume"]]
data

Unnamed: 0,date,ticker,ticker_name,open,high,low,close,volume
0,2009-08-21,A,Agilent Technologies,25.60,25.6100,25.220,25.55,34758
1,2009-08-24,A,Agilent Technologies,25.64,25.7400,25.330,25.50,22247
2,2009-08-25,A,Agilent Technologies,25.50,25.7000,25.225,25.34,30891
3,2009-08-26,A,Agilent Technologies,25.32,25.6425,25.145,25.48,33334
4,2009-08-27,A,Agilent Technologies,25.50,25.5700,25.230,25.54,70176
...,...,...,...,...,...,...,...,...
122569,2010-08-13,ZMH,,51.72,51.9000,51.380,51.44,14561
122570,2010-08-16,ZMH,,51.13,51.4700,50.600,51.00,13489
122571,2010-08-17,ZMH,,51.14,51.6000,50.890,51.21,20498
122572,2010-08-19,ZMH,,51.63,51.6300,50.170,50.22,18259
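
After a left merge, tickers missing from the names file keep an empty `ticker_name`. One possible way to handle them, assuming it is acceptable to fall back to the ticker symbol itself, is sketched below on hypothetical toy frames:

```python
import pandas as pd

# Hypothetical toy frames standing in for the price data and the names file
prices = pd.DataFrame({"ticker": ["A", "ZMH"], "close": [25.55, 50.22]})
names = pd.DataFrame({"ticker": ["A"], "ticker_name": ["Agilent Technologies"]})

# how="left" keeps every price row even when no name is found
merged = prices.merge(names, on="ticker", how="left")

# Assumption: use the ticker symbol as the name when none is available
merged["ticker_name"] = merged["ticker_name"].fillna(merged["ticker"])
print(merged)
```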


## Lab Assignment 2

### Basic operations with `DataFrame`

1.1 The files `recipes_sample.csv` and `reviews_sample.csv` contain information about recipes and about reviews of those recipes, respectively. Load the data from the files as `pd.DataFrame` objects named `recipes` and `reviews`. Make sure the index column of the `reviews` table (the unnamed column) is read correctly.

In [1]:
import numpy as np
import pandas as pd

In [2]:
recipes = pd.read_csv('data/recipes_sample.csv', parse_dates=['submitted'])
recipes

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients
0,george s at the cove black bean soup,44123,90,35193,2002-10-25,,an original recipe created by chef scott meska...,18.0
1,healthy for them yogurt popsicles,67664,10,91970,2003-07-26,,my children and their friends ask for my homem...,
2,i can t believe it s spinach,38798,30,1533,2002-08-29,,"these were so go, it surprised even me.",8.0
3,italian gut busters,35173,45,22724,2002-07-27,,my sister-in-law made these for us at a family...,
4,love is in the air beef fondue sauces,84797,25,4470,2004-02-23,4.0,i think a fondue is a very romantic casual din...,
...,...,...,...,...,...,...,...,...
29995,zurie s holey rustic olive and cheddar bread,267661,80,200862,2007-11-25,16.0,this is based on a french recipe but i changed...,10.0
29996,zwetschgenkuchen bavarian plum cake,386977,240,177443,2009-08-24,,"this is a traditional fresh plum cake, thought...",11.0
29997,zwiebelkuchen southwest german onion cake,103312,75,161745,2004-11-03,,this is a traditional late summer early fall s...,
29998,zydeco soup,486161,60,227978,2012-08-29,,this is a delicious soup that i originally fou...,


In [3]:
reviews = pd.read_csv('data/reviews_sample.csv', index_col=0)
reviews

Unnamed: 0,user_id,recipe_id,date,rating,review
370476,21752,57993,2003-05-01,5,Last week whole sides of frozen salmon fillet ...
624300,431813,142201,2007-09-16,5,So simple and so tasty! I used a yellow capsi...
187037,400708,252013,2008-01-10,4,"Very nice breakfast HH, easy to make and yummy..."
706134,2001852463,404716,2017-12-11,5,These are a favorite for the holidays and so e...
312179,95810,129396,2008-03-14,5,Excellent soup! The tomato flavor is just gre...
...,...,...,...,...,...
1013457,1270706,335534,2009-05-17,4,This recipe was great! I made it last night. I...
158736,2282344,8701,2012-06-03,0,This recipe is outstanding. I followed the rec...
1059834,689540,222001,2008-04-08,5,"Well, we were not a crowd but it was a fabulou..."
453285,2000242659,354979,2015-06-02,5,I have been a steak eater and dedicated BBQ gr...


1.2 For each table, print its basic parameters:
* the number of data points (rows);
* the number of columns;
* the data type of each column.

In [4]:
recipes.shape

(30000, 8)

In [5]:
recipes.dtypes

name                      object
id                         int64
minutes                    int64
contributor_id             int64
submitted         datetime64[ns]
n_steps                  float64
description               object
n_ingredients            float64
dtype: object

In [6]:
reviews.shape

(126696, 5)

In [7]:
reviews.dtypes

user_id       int64
recipe_id     int64
date         object
rating        int64
review       object
dtype: object

1.3 Find out which columns of the tables contain missing values. Compute the share of rows that contain missing values relative to the total number of rows.

In [36]:
recipes.isnull().sum()

name                      0
id                        0
minutes                   0
contributor_id            0
submitted                 0
n_steps               11190
description             623
n_ingredients          8880
description_length      623
name_word_count           0
dtype: int64

In [37]:
nanlines = recipes.isnull().sum(axis=1)
len(nanlines[nanlines>0])/len(nanlines)

0.5684666666666667

In [38]:
reviews.isnull().sum()

user_id       0
recipe_id     0
date          0
rating        0
review       17
dtype: int64

In [39]:
nanlines = reviews.isnull().sum(axis=1)
len(nanlines[nanlines>0])/len(nanlines)

0.00013417945317926376
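
The share of rows with at least one missing value can also be written as the mean of a boolean mask. A minimal sketch on a hypothetical toy frame:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: rows 1 and 2 each contain one missing value
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": ["x", "y", None, "w"],
})

# True for every row that has at least one missing value
row_has_nan = toy.isna().any(axis=1)

# The mean of a boolean Series is the share of True values
share = row_has_nan.mean()
print(share)
```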

1.4 Compute the mean of each numeric column (where it makes sense).

In [13]:
recipes[['id', 'minutes', 'contributor_id', 'n_steps', 'n_ingredients']].mean()

id                2.218793e+05
minutes           1.233581e+02
contributor_id    5.635901e+06
n_steps           9.805582e+00
n_ingredients     9.008286e+00
dtype: float64

In [14]:
reviews[['user_id', 'recipe_id', 'rating']].mean()

user_id      1.408013e+08
recipe_id    1.600944e+05
rating       4.410802e+00
dtype: float64

1.5 Create a series of 10 random recipe names.

In [15]:
recipes['name'].sample(n=10)

10002    easy five dollar meal aka sausage with beans
2669                 beef stew with barley  crock pot
24252                             simple mocha frappe
14214                      hot leek and artichoke dip
15240                   kato s  grand marnier dessert
5189                               cheeseburger pitas
18697                           no bake lemon squares
14013           honey mustard green beans vinaigrette
26548                                swedish pancakes
23329                   sandy s chicken and dumplings
Name: name, dtype: object

1.6 Change the index of the `reviews` table so that rows are numbered starting from zero.

In [16]:
reviews.reset_index(drop=True, inplace=True)
reviews

Unnamed: 0,user_id,recipe_id,date,rating,review
0,21752,57993,2003-05-01,5,Last week whole sides of frozen salmon fillet ...
1,431813,142201,2007-09-16,5,So simple and so tasty! I used a yellow capsi...
2,400708,252013,2008-01-10,4,"Very nice breakfast HH, easy to make and yummy..."
3,2001852463,404716,2017-12-11,5,These are a favorite for the holidays and so e...
4,95810,129396,2008-03-14,5,Excellent soup! The tomato flavor is just gre...
...,...,...,...,...,...
126691,1270706,335534,2009-05-17,4,This recipe was great! I made it last night. I...
126692,2282344,8701,2012-06-03,0,This recipe is outstanding. I followed the rec...
126693,689540,222001,2008-04-08,5,"Well, we were not a crowd but it was a fabulou..."
126694,2000242659,354979,2015-06-02,5,I have been a steak eater and dedicated BBQ gr...


1.7 Display the recipes that take no more than 20 minutes to prepare and have no more than 5 ingredients.

In [17]:
recipes[(recipes['minutes'] <= 20) & (recipes['n_ingredients'] <= 5)]

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients
28,quick biscuit bread,302399,20,213909,2008-05-06,11.0,this is a wonderful quick bread to make as an ...,5.0
60,peas fit for a king or queen,303944,20,213909,2008-05-16,,this recipe is so simple and the flavors are s...,5.0
90,hawaiian sunrise mimosa,100837,5,58104,2004-09-29,4.0,pineapple mimosa was changed to hawaiian sunri...,3.0
91,tasty dish s banana pudding in 2 minutes,286484,2,47892,2008-02-13,,"""mmmm, i love bananas!"" a --tasty dish-- origi...",4.0
94,1 minute meatballs,11361,13,4470,2001-09-03,,this is a real short cut for cooks in a hurry....,2.0
...,...,...,...,...,...,...,...,...
29873,zip and steam red potatoes with butter and garlic,304922,13,724218,2008-05-27,9.0,"i haven't tried this yet, but i am going to so...",5.0
29874,ziplock vanilla ice cream,74250,10,24386,2003-10-29,8.0,a fun thing for kids to do. may want to use mi...,3.0
29905,zucchini and corn with cheese,256177,15,305531,2007-09-29,4.0,from betty crocker fresh spring recipes. i lik...,5.0
29980,zucchini with jalapeno monterey jack,320622,10,305531,2008-08-20,3.0,simple and yummy!,3.0


### Working with dates in `pandas`

2.1 Convert the `submitted` column of the `recipes` table to a datetime type. Modify the solution to task 1.1 so that the column is read in the right type straight away.
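
The solution to 1.1 already passes `parse_dates=['submitted']` to `read_csv`. For a column that was read as plain strings, the conversion could be sketched with `pd.to_datetime`; the toy frame below is hypothetical:

```python
import pandas as pd

# Hypothetical toy frame with dates stored as strings
toy = pd.DataFrame({"submitted": ["2002-10-25", "2012-08-29"]})

# Convert the string column to a datetime column in place
toy["submitted"] = pd.to_datetime(toy["submitted"])
print(toy.dtypes)
```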

2.2 Display the recipes added to the dataset no later than 2010.

In [18]:
recipes[recipes['submitted'] <= np.datetime64('2010-12-31')]

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients
0,george s at the cove black bean soup,44123,90,35193,2002-10-25,,an original recipe created by chef scott meska...,18.0
1,healthy for them yogurt popsicles,67664,10,91970,2003-07-26,,my children and their friends ask for my homem...,
2,i can t believe it s spinach,38798,30,1533,2002-08-29,,"these were so go, it surprised even me.",8.0
3,italian gut busters,35173,45,22724,2002-07-27,,my sister-in-law made these for us at a family...,
4,love is in the air beef fondue sauces,84797,25,4470,2004-02-23,4.0,i think a fondue is a very romantic casual din...,
...,...,...,...,...,...,...,...,...
29993,zuni caf zucchini pickles,316950,2895,62264,2008-07-31,,refrigerator pickles for some of the zucchini ...,8.0
29995,zurie s holey rustic olive and cheddar bread,267661,80,200862,2007-11-25,16.0,this is based on a french recipe but i changed...,10.0
29996,zwetschgenkuchen bavarian plum cake,386977,240,177443,2009-08-24,,"this is a traditional fresh plum cake, thought...",11.0
29997,zwiebelkuchen southwest german onion cake,103312,75,161745,2004-11-03,,this is a traditional late summer early fall s...,


### Working with string data in `pandas`

3.1 Add a `description_length` column to the `recipes` table that stores the length of the recipe description from the `description` column.

In [19]:
recipes['description_length'] = recipes['description'].str.len()
recipes

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients,description_length
0,george s at the cove black bean soup,44123,90,35193,2002-10-25,,an original recipe created by chef scott meska...,18.0,330.0
1,healthy for them yogurt popsicles,67664,10,91970,2003-07-26,,my children and their friends ask for my homem...,,255.0
2,i can t believe it s spinach,38798,30,1533,2002-08-29,,"these were so go, it surprised even me.",8.0,39.0
3,italian gut busters,35173,45,22724,2002-07-27,,my sister-in-law made these for us at a family...,,154.0
4,love is in the air beef fondue sauces,84797,25,4470,2004-02-23,4.0,i think a fondue is a very romantic casual din...,,587.0
...,...,...,...,...,...,...,...,...,...
29995,zurie s holey rustic olive and cheddar bread,267661,80,200862,2007-11-25,16.0,this is based on a french recipe but i changed...,10.0,484.0
29996,zwetschgenkuchen bavarian plum cake,386977,240,177443,2009-08-24,,"this is a traditional fresh plum cake, thought...",11.0,286.0
29997,zwiebelkuchen southwest german onion cake,103312,75,161745,2004-11-03,,this is a traditional late summer early fall s...,,311.0
29998,zydeco soup,486161,60,227978,2012-08-29,,this is a delicious soup that i originally fou...,,648.0


3.2 Change the name of every recipe in the `recipes` table so that each word in the name starts with a capital letter.

In [20]:
recipes['name'] = recipes['name'].str.title()
recipes

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients,description_length
0,George S At The Cove Black Bean Soup,44123,90,35193,2002-10-25,,an original recipe created by chef scott meska...,18.0,330.0
1,Healthy For Them Yogurt Popsicles,67664,10,91970,2003-07-26,,my children and their friends ask for my homem...,,255.0
2,I Can T Believe It S Spinach,38798,30,1533,2002-08-29,,"these were so go, it surprised even me.",8.0,39.0
3,Italian Gut Busters,35173,45,22724,2002-07-27,,my sister-in-law made these for us at a family...,,154.0
4,Love Is In The Air Beef Fondue Sauces,84797,25,4470,2004-02-23,4.0,i think a fondue is a very romantic casual din...,,587.0
...,...,...,...,...,...,...,...,...,...
29995,Zurie S Holey Rustic Olive And Cheddar Bread,267661,80,200862,2007-11-25,16.0,this is based on a french recipe but i changed...,10.0,484.0
29996,Zwetschgenkuchen Bavarian Plum Cake,386977,240,177443,2009-08-24,,"this is a traditional fresh plum cake, thought...",11.0,286.0
29997,Zwiebelkuchen Southwest German Onion Cake,103312,75,161745,2004-11-03,,this is a traditional late summer early fall s...,,311.0
29998,Zydeco Soup,486161,60,227978,2012-08-29,,this is a delicious soup that i originally fou...,,648.0


3.3 Add a `name_word_count` column to the `recipes` table that stores the number of words in the recipe name (assume that words in a name are separated only by spaces). Note that there may be several consecutive spaces between words.

In [21]:
recipes['name_word_count'] = recipes['name'].str.split().str.len()
recipes

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients,description_length,name_word_count
0,George S At The Cove Black Bean Soup,44123,90,35193,2002-10-25,,an original recipe created by chef scott meska...,18.0,330.0,8
1,Healthy For Them Yogurt Popsicles,67664,10,91970,2003-07-26,,my children and their friends ask for my homem...,,255.0,5
2,I Can T Believe It S Spinach,38798,30,1533,2002-08-29,,"these were so go, it surprised even me.",8.0,39.0,7
3,Italian Gut Busters,35173,45,22724,2002-07-27,,my sister-in-law made these for us at a family...,,154.0,3
4,Love Is In The Air Beef Fondue Sauces,84797,25,4470,2004-02-23,4.0,i think a fondue is a very romantic casual din...,,587.0,8
...,...,...,...,...,...,...,...,...,...,...
29995,Zurie S Holey Rustic Olive And Cheddar Bread,267661,80,200862,2007-11-25,16.0,this is based on a french recipe but i changed...,10.0,484.0,8
29996,Zwetschgenkuchen Bavarian Plum Cake,386977,240,177443,2009-08-24,,"this is a traditional fresh plum cake, thought...",11.0,286.0,4
29997,Zwiebelkuchen Southwest German Onion Cake,103312,75,161745,2004-11-03,,this is a traditional late summer early fall s...,,311.0,5
29998,Zydeco Soup,486161,60,227978,2012-08-29,,this is a delicious soup that i originally fou...,,648.0,2
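
Calling `str.split()` without arguments treats any run of whitespace as one separator, which is why it copes with repeated spaces. A small sketch comparing it with splitting on a single space (the names are illustrative):

```python
import pandas as pd

# The first name contains two consecutive spaces
names = pd.Series(["beef stew with barley  crock pot", "zydeco soup"])

# No separator given: runs of whitespace count as a single delimiter
word_count = names.str.split().str.len()

# Splitting on one space produces an empty string between the two spaces
naive_count = names.str.split(" ").str.len()

print(word_count.tolist(), naive_count.tolist())
```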


### Grouping `pd.DataFrame` tables

4.1 Count the number of recipes submitted by each contributor (`contributor_id`). Which contributor added the largest number of recipes?

In [22]:
recipes.groupby('contributor_id').size()

contributor_id
1530            5
1533          186
1534           50
1535           40
1538            8
             ... 
2001968497      2
2002059754      1
2002234079      1
2002234259      1
2002247884      1
Length: 8404, dtype: int64

In [23]:
max(recipes.groupby('contributor_id').size())

421
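
`max` returns only the largest count, not the contributor it belongs to. To answer "which contributor", `idxmax` can be applied to the same grouped sizes. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Hypothetical toy data: contributor 1533 appears three times
toy = pd.DataFrame({"contributor_id": [1533, 1533, 1530, 1533, 1534]})

counts = toy.groupby("contributor_id").size()

# idxmax returns the index label (the contributor id) of the largest count
top_contributor = counts.idxmax()
top_count = counts.max()
print(top_contributor, top_count)
```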

4.2 Compute the average rating of each recipe. How many recipes have no reviews? Note that a review with a zero rating or an empty text is not considered missing.

In [24]:
reviews.groupby('recipe_id')['rating'].mean()

recipe_id
48        1.000000
55        4.750000
66        4.944444
91        4.750000
94        5.000000
            ...   
536547    5.000000
536610    0.000000
536728    4.000000
536729    4.750000
536747    0.000000
Name: rating, Length: 28100, dtype: float64

In [25]:
len(recipes['id']) - len(reviews['recipe_id'].unique())

1900
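
The subtraction above relies on recipe ids being unique and on every reviewed id being present in `recipes`. A variant that does not need the second assumption can be sketched with `isin`, again on hypothetical toy frames:

```python
import pandas as pd

# Hypothetical toy frames: recipe 401411 has no reviews
recipes_toy = pd.DataFrame({"id": [48, 55, 401411]})
reviews_toy = pd.DataFrame({"recipe_id": [48, 48, 55]})

# True for recipes whose id never appears among the reviews
no_reviews = ~recipes_toy["id"].isin(reviews_toy["recipe_id"])
n_without = int(no_reviews.sum())
print(n_without)
```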

4.3 Count the number of recipes broken down by year of creation.

In [26]:
recipes.groupby('submitted').size()

submitted
1999-08-06    1
1999-08-09    4
1999-08-10    3
1999-08-11    3
1999-08-12    4
             ..
2018-07-25    1
2018-07-30    1
2018-07-31    1
2018-08-11    2
2018-08-15    1
Length: 4032, dtype: int64
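
Grouping by `submitted` itself yields one group per calendar date. A breakdown by year can be sketched by grouping on `.dt.year` instead; the toy frame below is hypothetical:

```python
import pandas as pd

# Hypothetical toy data: two recipes from 1999 and one from 2018
toy = pd.DataFrame({
    "submitted": pd.to_datetime(["1999-08-06", "1999-08-09", "2018-07-25"]),
})

# Group on the year extracted from the date, not on the full date
per_year = toy.groupby(toy["submitted"].dt.year).size()
print(per_year)
```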

### Merging `pd.DataFrame` tables

5.1 By merging the tables, create a `DataFrame` with four columns: `id`, `name`, `user_id`, `rating`. Recipes that have no reviews must be absent from the resulting table. Verify that your code works by picking a recipe with no reviews and trying to find its row in the resulting `DataFrame`.

In [27]:
merged_1 = pd.merge(recipes, reviews, left_on='id', right_on='recipe_id', how='inner')[['id', 'name', 'user_id', 'rating']]
merged_1

Unnamed: 0,id,name,user_id,rating
0,44123,George S At The Cove Black Bean Soup,743566,5
1,44123,George S At The Cove Black Bean Soup,76503,5
2,44123,George S At The Cove Black Bean Soup,34206,5
3,67664,Healthy For Them Yogurt Popsicles,494084,5
4,67664,Healthy For Them Yogurt Popsicles,303445,5
...,...,...,...,...
126691,486161,Zydeco Soup,305531,5
126692,486161,Zydeco Soup,1271506,5
126693,486161,Zydeco Soup,724631,5
126694,486161,Zydeco Soup,133174,5


In [28]:
set(recipes['id']) - set(reviews['recipe_id'].unique())
merged_1[merged_1['id'] == 401411]

Unnamed: 0,id,name,user_id,rating


5.2 By merging and grouping the tables, create a `DataFrame` with three columns: `recipe_id`, `name`, `review_count`, where `review_count` holds the number of reviews left on the recipe `recipe_id`. Recipes with no reviews must have 0 in the `review_count` column. Verify that your code works by picking a recipe with no reviews and finding its row in the resulting `DataFrame`.

In [29]:
merged_2 = pd.merge(recipes, reviews, left_on='id', right_on='recipe_id', how='left')[['id', 'recipe_id', 'name']] 
merged_2 = merged_2.groupby(['id', 'name'])['recipe_id'].count().reset_index()
merged_2.columns = ['recipe_id', 'name', 'review_count']
merged_2

Unnamed: 0,recipe_id,name,review_count
0,48,Boston Cream Pie,2
1,55,Betty Crocker S Southwestern Guacamole Dip,4
2,66,Black Coffee Barbecue Sauce,18
3,91,Brown Rice And Vegetable Pilaf,4
4,94,Blueberry Buttertarts,4
...,...,...,...
29995,536547,Cauliflower Ceviche,1
29996,536610,Miracle Home Made Puff Pastry,1
29997,536728,Gluten Free Vegemite,1
29998,536729,Creole Watermelon Feta Salad,4


In [30]:
merged_2[merged_2['recipe_id'] == 401411]

Unnamed: 0,recipe_id,name,review_count
25902,401411,Crunchy Ranch Croutons,0


5.3 Find out which submission year has the lowest average recipe rating.

In [31]:
merged_3 = pd.merge(recipes, reviews, left_on='id', right_on='recipe_id', how='inner')[['id', 'submitted', 'rating']]
merged_3 = merged_3.groupby(['id', 'submitted'])['rating'].mean().reset_index()
merged_3 = merged_3.sort_values(by='rating')
merged_3

Unnamed: 0,id,submitted,rating
28099,536747,2018-08-15,0.0
16800,251838,2007-09-10,0.0
16771,251198,2007-09-05,0.0
3347,48590,2002-12-12,0.0
16696,249917,2007-08-30,0.0
...,...,...,...
18205,277427,2008-01-08,5.0
18208,277466,2008-01-08,5.0
18210,277540,2008-01-08,5.0
18195,277181,2008-01-07,5.0


In [32]:
merged_3[merged_3['rating'] == 0]['submitted'].dt.year

28099    2018
16800    2007
16771    2007
3347     2002
16696    2007
         ... 
20587    2008
23663    2009
23683    2009
1989     2002
20557    2008
Name: submitted, Length: 660, dtype: int64
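
The cells above list the years of individual recipes whose mean rating is zero. To name a single year, the ratings can be averaged per submission year and the minimum taken. The sketch below uses hypothetical toy frames and assumes the average is taken over all reviews of recipes from that year, which is one possible reading of the task:

```python
import pandas as pd

# Hypothetical toy frames, not the lab data
recipes_toy = pd.DataFrame({
    "id": [1, 2, 3],
    "submitted": pd.to_datetime(["2007-01-01", "2007-06-01", "2008-03-01"]),
})
reviews_toy = pd.DataFrame({"recipe_id": [1, 1, 2, 3], "rating": [5, 4, 0, 5]})

merged = recipes_toy.merge(
    reviews_toy, left_on="id", right_on="recipe_id", how="inner"
)

# Mean rating over all reviews, grouped by the recipe's submission year
mean_by_year = merged.groupby(merged["submitted"].dt.year)["rating"].mean()

# idxmin returns the year with the smallest mean
worst_year = mean_by_year.idxmin()
print(worst_year)
```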

### Saving `pd.DataFrame` tables

6.1 Sort the table in descending order of the `name_word_count` column and save the results of tasks 3.1-3.3 to a CSV file.

In [33]:
recipes = recipes.sort_values(by='name_word_count', ascending=False)
recipes.to_csv('data/new_recipes.csv', index=False)
recipes

Unnamed: 0,name,id,minutes,contributor_id,submitted,n_steps,description,n_ingredients,description_length,name_word_count
26223,Subru Uncle S Whole Green Moong Dal I Ll Be Ma...,77188,95,6357,2003-11-21,,my dad and mom quite enjoy this lentil curry. ...,15.0,343.0,15
28083,Tsr Version Of T G I Friday S Black Bean Soup...,102274,75,74652,2004-10-19,9.0,from www.topsecretrecipes.com i got this copyc...,16.0,436.0,14
26222,Subru Uncle S Toor Ki Dal Sindhi Style Dad M...,76908,65,6357,2003-11-18,29.0,this is the lentil curry that subru uncle(our ...,15.0,1087.0,14
27876,Top Secret Recipes Version Of I H O P Griddl...,113346,20,175727,2005-03-14,5.0,this recipe is top secret recipes version of i...,9.0,129.0,14
5734,Chicken Curry Or Cat S Vomit On A Bed Of Magg...,294898,30,802799,2008-03-28,11.0,an old family recipe that's easy to make since...,12.0,144.0,13
...,...,...,...,...,...,...,...,...,...,...
3253,Blackmoons,323195,430,415934,2008-09-04,5.0,my mom was a newlywed in the 1950s when she fo...,,389.0,1
4138,Bushwhacker,156521,10,177392,2006-02-17,1.0,this drink is an excellent after dinner drink ...,6.0,124.0,1
2357,Basbousa,12957,60,18391,2001-10-20,,this is a traditional middle eastern dessert. ...,,78.0,1
15052,Josefinas,264859,20,498271,2007-11-11,7.0,"from the junior league of corpus christi tx, t...",,92.0,1


6.2 Using `pd.ExcelWriter`, save the results of 5.1 and 5.2 to a file: write the result of 5.1 to a sheet named `Рецепты с оценками` and the result of 5.2 to a sheet named `Количество отзывов по рецептам`.

In [34]:
with pd.ExcelWriter('data/new_recipes.xlsx') as writer:
    merged_1.to_excel(writer, sheet_name='Рецепты с оценками', index=False)
    merged_2.to_excel(writer, sheet_name='Количество отзывов по рецептам', index=False)

#### [version 2]
* Clarified the wording of tasks 1.1, 3.3, 4.2, 5.1, 5.2, 5.3