# CodeRun β

[link to website](https://coderun.yandex.ru/catalog)

## Analytics track

### Task 259. A users log

We want to build a product recommendation model using user logs. However, user features are scattered across 20 files, each of which contains two columns: user ID and feature value.

Calculate the number of users who have at least 50% of all feature fields filled.

**Input format:**
- All files are sorted by user ID.
- User IDs are unique in the feature files and not unique in the log.

In [293]:
df_log = pd.read_csv('~/algo/coderun_features_files/log.csv',
                     header=None, names=['user_id', 'action'])

In [294]:
df_log = df_log.drop_duplicates(['user_id']).reset_index(drop=True)[['user_id']]

In [295]:
import os
import pandas as pd

# Get a list of CSV files from a folder
path = '/home/nikitin_a/algo/coderun_features_files/'
files = [f for f in os.listdir(path) if f.startswith('feature')]

# Read each file into a dataframe and append it to a dataframe with user_id
cnt = 1
for file in files:
    df_feature = pd.read_csv(path+file, header=None, names=['user_id', f'feature_{cnt}'])
    cnt += 1
    df_log = pd.merge(df_log, df_feature, on='user_id', how='left')

In [296]:
df_log['is_na_sum'] = df_log.isna().sum(axis=1)

In [297]:
df_log[df_log['is_na_sum'] <= 10].user_id.nunique()

870

### Task 219. Travel time

Traveler Vasya is choosing a time to go on a trip to a new country. Vasya believes that the vacation is successful if the temperature in the country rises during the travel time, and the stronger the temperature rises relative to the moment of arrival, the better.

Given a weather forecast for a certain period ahead, you need to indicate the temperature change for the best period for Vasya and the best arrival and departure dates (day numbers). If there are several best options, then indicate the day numbers of the shortest trip with the nearest end date.

**Input format:**

One line of integers separated by a space.

The number of days is not more than 10,000, the temperature is positive everywhere and does not exceed 45 degrees.

**Output format:**

Three numbers separated by a space: the temperature change for the optimal period, the arrival day number, and the departure day number (day numbering starting from 0). It is guaranteed that with the optimal choice of the trip, the temperature will increase.

In [198]:
import sys


def main():
    inp = [int(x) for x in input().split()]

    max_t = 0
    start = 0
    stop = 1
    for i in range(0, len(inp)):
        for j in range(i+1, len(inp)):
            if inp[j] - inp[i] > max_t:
                max_t = inp[j] - inp[i]
                start, stop = i, j        

    print(max_t, start, stop)


if __name__ == '__main__':
    main()

 3 4 1 6


5 2 3


### Task 281. Hair saloon

Nikolay works in a hair salon in a big city. One day, before going to work, he realized that he wouldn't be able to serve all of his N scheduled clients for the day and could only serve K. Since Nikolay can't come to work twice or take a break, he needs K clients to come in a row. Knowing the probabilities that a client will never return to the hair salon after a cancellation, it is necessary to find the minimum expected number of lost clients given the best cancellation scenario.

**Input format:**

The first line contains the number N (1 ≤ N ≤ 1000000). The second line contains the number K (0 ≤ K ≤ N). The next N lines contain a fractional number between 0 and 1. The i-th number represents the probability that the i-th client will not return after a cancellation.

**Output format:**

A fractional number - the minimum expected number of lost clients.

In [162]:
def main():
    n = int(input())
    k = int(input())
    clients = [float(input()) for _ in range(n)]

    if k == 0:
        return sum(clients)
    elif k == n:
        return 0
    elif k == 1:
        return sum(clients) - max(clients)
    else:
        dp = [0] * n
        dp[0] = clients[0]
        for i in range(1, n):
            dp[i] = dp[i-1] + clients[i]
            if i >= k:
                dp[i] -= clients[i-k]
        return sum(clients) - max(dp)


if __name__ == '__main__':
    print(main())

 5
 3
 0.668103249992
 0.525906286805
 0.0793836313371
 0.986652106472
 0.010960416731


0.6790636667229997


### Task 339. Greg's Parties

Greg loves parties. Every week, he randomly selects a non-empty subset of all his friends in his messenger, creates a chat and invites all participants to have fun. Greg knows that the probability of a particular chat participant attending the party is `x/(x^2-5x+10)`, where x is the number of people in the subset. The decisions of the participants are independent. The party is considered awesome if there are at least two people present (including Greg, who always comes).

What is the probability that all of Greg's parties will be awesome over the course of a year (52 weeks) if Greg maximizes their number?

**Output format**

A floating-point number with three digits after the decimal point (0.000).

In [65]:
weeks = 52
max_awesomenes = 0
for x in range(2, 11):
    p_one_come = x / (x**2 - 5*x + 10)
    p_no_show = (1 - p_one_come)**x
    p_awesome = 1 - p_no_show
    p_all_awesome = p_awesome ** weeks
    if p_all_awesome > max_awesomenes:
        max_awesomenes = p_all_awesome

print(round(max_awesomenes, 3))        

0.524


### Task 189. Dj Pasha

Pasha is preparing for a party and creating a playlist of hip-hop music. He doesn't want to use common tracks and is looking for something new that he hasn't heard before.

Pasha turned on a stream of unknown tracks on Yandex Music and is selecting a playlist. He calculated that an unknown track becomes his favorite with a probability of 20%.

And among his favorites, on average, every third track is in the hip-hop genre.

Pasha is trying to figure out how much time he needs to allocate to create a cool hip-hop playlist. Help him calculate how much time, on average, he needs to spend to hear one favorite hip-hop track if on average, one musical track lasts 2 minutes 45 seconds.

**Output Format**

The average amount of time needed to spend to listen to one favorite hip-hop track in seconds.

In [33]:
import random
import numpy as np


trials = 100000
avg_track = (2 * 60) + 45
genres = ['hip-hop', 'rock', 'pop']

results = []
for _ in range(trials):
    tracks = ['liked'] * 2000 + ['skip'] * 8000
    random.shuffle(tracks)
    time_avg = 0
    genre = False
    while genre != 'hip-hop':
        playing_now = tracks.pop()
        time_avg += avg_track
        if playing_now == 'liked':
            genre = random.choice(genres)
    results.append(time_avg)

mu = np.mean(results)    
se = np.std(results) / len(results)
d = 1.96 * se


print(f'An average time to find a liked hip-hop track: {mu:.2f} ± {d:.2f} seconds.')

An average time to find a liked hip-hop track: 2475.74 ± 0.05 seconds.


### Task 322. Deck of cards

From a well-shuffled full deck of cards (from twos to aces, 4 suits, a total of 52 cards), you are dealt 6 cards in a row. What is the probability that the sum of the dealt cards will be 21 points? Assume that the jack is worth 11 points, the queen is worth 12 points, the king is worth 13 points, and the ace is worth 14 points. The point value for the other cards coincides with their nominal value.

Round your answer to six decimal places.

**Output format:**

A number from 0 to 1 with six decimal places. Example of the answer format: 0.123456

In [1]:
import random


trials = 1000000
sum_ = 21
cards = [2] * 4 + [3] * 4 + \
        [4] * 4 + [5] * 4 + \
        [6] * 4 + [7] * 4 + \
        [8] * 4 + [9] * 4 + \
        [10] * 4 + [11] * 4 + \
        [12] * 4 + [13] * 4 + \
        [14] * 4

cnt = 0
for trial in range(trials):
    random.shuffle(cards)
    ans = sum(random.choices(cards, k=6))
    if ans == sum_:
        cnt += 1

print(cnt/trials)

0.000435


### Task 191. The most difficult letter

Vasily decided to improve his typing speed. He noticed that he spends more time looking for some letters on the keyboard than others. He wonders which letter he searched for the longest. Write a program that will help Vasily find out.

Vasily entered N letters. The string S, entered by Vasily, has a length of N.

The array A contains N non-negative integers, each number Ai is the time in milliseconds from the beginning of input until the i-th letter was typed.

It is assumed that Vasily started looking for the next letter immediately after typing the previous one. He was looking for the letter with the index 0 for A0 milliseconds.

When solving in the Dart language, use input and output through files, the standard input stream works too slowly.

**Input format**

The first line of input contains N - the number of letters entered.

The second line contains S - the entered string consisting of N letters.

The third line contains A - N non-negative integers separated by spaces.

**Constraints**

0 < N < 10^6
0 ≤ Ai < 10^8
The array A is sorted in ascending order:

**Output format**

Output the letter that Vasily searched for the longest. If there are several letters with the same search time, output the one that he typed last.

In [2]:
n = int(input())
letters = input()
seconds = list(map(int, input().split()))

max_time = 0
max_idx = 0
for i in range(n):
    if i == 0:
        ans = seconds[i] - 0
        if ans > max_time:
            max_idx = i
            max_time = ans
    else:
        ans = seconds[i] - seconds[i-1]
        if ans >= max_time:
            max_idx = i
            max_time = ans

print(letters[max_idx])

 3
 adc
 1 7 5


d
