--- Day 2: Gift Shop ---
You get inside and take the elevator to its only other stop: the gift shop. "Thank you for visiting the North Pole!" gleefully exclaims a nearby sign. You aren't sure who is even allowed to visit the North Pole, but you know you can access the lobby through here, and from there you can access the rest of the North Pole base.

As you make your way through the surprisingly extensive selection, one of the clerks recognizes you and asks for your help.

As it turns out, one of the younger Elves was playing on a gift shop computer and managed to add a whole bunch of invalid product IDs to their gift shop database! Surely, it would be no trouble for you to identify the invalid product IDs for them, right?

They've even checked most of the product ID ranges already; they only have a few product ID ranges (your puzzle input) that you'll need to check. For example:

11-22,95-115,998-1012,1188511880-1188511890,222220-222224,
1698522-1698528,446443-446449,38593856-38593862,565653-565659,
824824821-824824827,2121212118-2121212124
(The ID ranges are wrapped here for legibility; in your input, they appear on a single long line.)

The ranges are separated by commas (,); each range gives its first ID and last ID separated by a dash (-).

Since the young Elf was just doing silly patterns, you can find the invalid IDs by looking for any ID which is made only of some sequence of digits repeated twice. So, 55 (5 twice), 6464 (64 twice), and 123123 (123 twice) would all be invalid IDs.

None of the numbers have leading zeroes; 0101 isn't an ID at all. (101 is a valid ID that you would ignore.)

Your job is to find all of the invalid IDs that appear in the given ranges. In the above example:

11-22 has two invalid IDs, 11 and 22.
95-115 has one invalid ID, 99.
998-1012 has one invalid ID, 1010.
1188511880-1188511890 has one invalid ID, 1188511885.
222220-222224 has one invalid ID, 222222.
1698522-1698528 contains no invalid IDs.
446443-446449 has one invalid ID, 446446.
38593856-38593862 has one invalid ID, 38593859.
The rest of the ranges contain no invalid IDs.
Adding up all the invalid IDs in this example produces 1227775554.

What do you get if you add up all of the invalid IDs?

In [None]:
from aocd import get_data, submit

In [None]:
data = get_data(year = 2025, day = 2)

In [None]:
data_ls = data.split(",")
data_ls[:5]

['1061119-1154492', '3-23', '5180469-5306947', '21571-38630', '1054-2693']

In [None]:
rng = data_ls[1]
rng

'3-23'

In [None]:
l, r = rng.split("-")
l, r = int(l), int(r)

list(range(l, r+1))

[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]

Okay, I think this is a good starting point (maybe full iteration isn't necessary, but I'm happy with an approach that isn't fully efficient).

How do I detect repetition? I'm trying to think of a good way to do this without iterating over all the numbers before it and checking each for repetition, which would be super inefficient.

##### 🤖Reply🤖<!-- SOLVEIT_SEPARATOR_7f3a9b2c -->


Good progress! Here's a hint: you don't need to check against other numbers at all.

Think about it as a string problem. For a number like `123123`, what's special about the string `"123123"`? What's the relationship between the first half and the second half?

Great question! The key insight: if a number is "doubled" like `6464` or `123123`, then its string representation has **even length** and the **first half equals the second half**.

So for any number `n`:
```python
s = str(n)
len(s) % 2 == 0 and s[:len(s)//2] == s[len(s)//2:]
```

Try this check on a few examples like `6464`, `123123`, and `123` to see if it works!
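As a minimal sketch of that suggestion, the check can be wrapped in a small helper and run on the examples mentioned. The name `is_doubled` is made up here for illustration; it is not part of the notebook.

```python
def is_doubled(n):
    """Return True if the decimal digits of n are some block written exactly twice."""
    s = str(n)
    half = len(s) // 2
    # Even length is required, and the two halves must match.
    return len(s) % 2 == 0 and s[:half] == s[half:]

for n in (6464, 123123, 123, 101):
    print(n, is_doubled(n))
# 6464 True
# 123123 True
# 123 False
# 101 False
```

Odd-length numbers such as `123` and `101` fail on the length test alone, so the slice comparison is never evaluated for them.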

Good point, I don't actually need to treat these as ints for the check at all.

In [None]:
def pt1(data_ls):
    invalid_ids = []
    for rng in data_ls:
        l, r = rng.split("-")
        
        for i in range(int(l), int(r)+1):
            s = str(i) 
            if len(s) % 2 == 0:
                if s[:len(s)//2] == s[len(s)//2:]:
                    invalid_ids.append(int(s))

    sum_ = sum(invalid_ids)

    return invalid_ids, sum_

In [None]:
example_ls = """11-22,95-115,998-1012,1188511880-1188511890,222220-222224,
1698522-1698528,446443-446449,38593856-38593862,565653-565659,
824824821-824824827,2121212118-2121212124""".split(",")

In [None]:
pt1(example_ls)

([11, 22, 99, 1010, 1188511885, 222222, 446446, 38593859], 1227775554)
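The earlier note wondered whether full iteration is necessary. One possible alternative, sketched here as an assumption rather than as the approach used in this notebook, relies on the fact that a doubled ID with a `k`-digit half `h` equals `h * (10**k + 1)`. The doubled IDs inside a range can then be summed with arithmetic instead of a loop over every number. The function name `sum_doubled_in_range` is hypothetical.

```python
def sum_doubled_in_range(lo, hi):
    """Sum all 'doubled' IDs in [lo, hi] without visiting every number."""
    total = 0
    k = 1  # number of digits in each half
    while True:
        m = 10**k + 1                 # doubled ID = half * m
        smallest_half = 10**(k - 1)   # halves have no leading zero
        if smallest_half * m > hi:
            break                     # every longer doubled ID is above hi
        h_lo = max(smallest_half, (lo + m - 1) // m)  # ceil(lo / m)
        h_hi = min(10**k - 1, hi // m)
        if h_lo <= h_hi:
            count = h_hi - h_lo + 1
            # Arithmetic series over the halves, scaled by m.
            total += m * (h_lo + h_hi) * count // 2
        k += 1
    return total

example = ("11-22,95-115,998-1012,1188511880-1188511890,222220-222224,"
           "1698522-1698528,446443-446449,38593856-38593862,565653-565659,"
           "824824821-824824827,2121212118-2121212124")
total = sum(sum_doubled_in_range(*map(int, rng.split("-")))
            for rng in example.split(","))
print(total)  # 1227775554, the Part One example total from the puzzle text
```

This only covers the Part One rule (exactly two repeats), and the brute-force loop above is already fast enough for ranges of this size.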

In [None]:
_, pt1_answer = pt1(data_ls)

In [None]:
submit(answer = pt1_answer, part = "a", year = 2025, day = 2)

That's the right answer!  You are one gold star closer to decorating the North Pole. [Continue to Part Two]


<urllib3.response.HTTPResponse at 0x72c38b27cc10>

--- Part Two ---
The clerk quickly discovers that there are still invalid IDs in the ranges in your list. Maybe the young Elf was doing other silly patterns as well?

Now, an ID is invalid if it is made only of some sequence of digits repeated at least twice. So, 12341234 (1234 two times), 123123123 (123 three times), 1212121212 (12 five times), and 1111111 (1 seven times) are all invalid IDs.

From the same example as before:

11-22 still has two invalid IDs, 11 and 22.
95-115 now has two invalid IDs, 99 and 111.
998-1012 now has two invalid IDs, 999 and 1010.
1188511880-1188511890 still has one invalid ID, 1188511885.
222220-222224 still has one invalid ID, 222222.
1698522-1698528 still contains no invalid IDs.
446443-446449 still has one invalid ID, 446446.
38593856-38593862 still has one invalid ID, 38593859.
565653-565659 now has one invalid ID, 565656.
824824821-824824827 now has one invalid ID, 824824824.
2121212118-2121212124 now has one invalid ID, 2121212121.
Adding up all the invalid IDs in this example produces 4174379265.

What do you get if you add up all of the invalid IDs using these new rules?

This is the next part. 

I'm trying to think how to adapt the two-halves check to work for any number of repeats.

##### 🤖Reply🤖<!-- SOLVEIT_SEPARATOR_7f3a9b2c -->

Good question! Think about what divisors of the string length would work.

For example, if a string has length 6, it could be:
- A pattern of length 3 repeated 2 times
- A pattern of length 2 repeated 3 times
- A pattern of length 1 repeated 6 times

What property do these pattern lengths (3, 2, 1) have in relation to 6?

Right, so I think it's basically a loop over pattern lengths from 1 to `len(s)//2`, keeping the ones that divide the length.
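To make the divisor idea concrete, here is a small sketch that lists the candidate pattern lengths for a given number of digits. The helper name `pattern_lengths` is an assumption for illustration only.

```python
def pattern_lengths(n_digits):
    """Pattern lengths that could tile a string of n_digits at least twice."""
    # A pattern must divide the length evenly and fit at least two times,
    # so it can be at most half the length.
    return [p for p in range(1, n_digits // 2 + 1) if n_digits % p == 0]

print(pattern_lengths(6))  # [1, 2, 3]
print(pattern_lengths(7))  # [1]
print(pattern_lengths(1))  # []
```

Note that the upper bound is inclusive: `range` needs `n_digits // 2 + 1` as its stop value so that the half-length pattern is tried.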

In [None]:
def pt2(data_ls):
    invalid_ids = []
    for rng in data_ls:
        l, r = rng.split("-")
        
        for i in range(int(l), int(r)+1):
            s = str(i)
            len_s = len(s)
            max_repeat = len_s // 2

            for i in range(1, max_repeat):
                if len_s % i == 0:
                    num_repeat = int(len_s / i)

                    matching = True 
                    for rep in range(num_repeat):
                        matching = matching & (s[(i*rep):(i*(rep+1))] == s[(i*(rep+1)):(i*(rep+2))])
                        if not matching:
                            break

                    if matching:
                        invalid_ids.append(int(s))

    sum_ = sum(invalid_ids)

    return invalid_ids, sum_

In [None]:
pt2(example_ls)

([], 0)

what am I doing wrong above?

##### 🤖Reply🤖<!-- SOLVEIT_SEPARATOR_7f3a9b2c -->

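Comparing consecutive chunks is fine in principle, but the inner loop makes one comparison too many, so the last one always runs past the end of the string.

For example, with `"222222"` and pattern length 2:
- You want to check `s[0:2]` == `s[2:4]` and `s[2:4]` == `s[4:6]`

But your code goes on to compare `s[4:6]` to `s[6:8]` (which doesn't exist, so the slice is empty and the comparison is `False`).

Separately, `range(1, max_repeat)` stops before `max_repeat`, so the half-length pattern (and every pattern for 2- and 3-digit numbers) is never tried.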

Try simplifying: extract the pattern once, then check if repeating it gives you back the original string. What would that look like?
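One way that simplification might look, as a sketch rather than the code the notebook ends up using: take the first `p` characters as the pattern, repeat it, and compare against the whole string. The name `is_repeated_pattern` is hypothetical.

```python
def is_repeated_pattern(s):
    """Return True if s is some shorter block repeated two or more times."""
    n = len(s)
    for p in range(1, n // 2 + 1):   # candidate pattern lengths, half included
        if n % p == 0 and s[:p] * (n // p) == s:
            return True              # one matching pattern length is enough
    return False

for s in ("123123", "999", "1010", "101", "5"):
    print(s, is_repeated_pattern(s))
# 123123 True
# 999 True
# 1010 True
# 101 False
# 5 False
```

String multiplication replaces the inner chunk-by-chunk loop, so there are no slice indices that can run past the end.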

In [None]:
rng = example_ls[2]

In [None]:
i = 999
s = str(i)
len_s = len(s)
max_repeat = len_s // 2

In [None]:
for patt_len in range(1, max_repeat+1):
    if len_s % patt_len == 0:
        num_repeat = int(len_s / patt_len)

        matching = True 
        for rep in range(num_repeat-1):
            matching = matching & (s[(patt_len*rep):(patt_len*(rep+1))] == s[(patt_len*(rep+1)):(patt_len*(rep+2))])
            print(matching)  # debug: show the running result after each chunk comparison
            if not matching:
                break

True
True


In [None]:
matching

True

In [None]:
def pt2(data_ls):
    invalid_ids = []
    for rng in data_ls:
        l, r = rng.split("-")
        
        for i in range(int(l), int(r)+1):
            s = str(i)
            len_s = len(s)
            max_repeat = len_s // 2

            for patt_len in range(1, max_repeat+1):
                if len_s % patt_len == 0:
                    num_repeat = int(len_s / patt_len)

                    matching = True 
                    for rep in range(num_repeat-1):
                        matching = matching & (s[(patt_len*rep):(patt_len*(rep+1))] == s[(patt_len*(rep+1)):(patt_len*(rep+2))])
                        if not matching:
                            break
                            
                    if matching:
                        invalid_ids.append(int(s))

    sum_ = sum(invalid_ids)

    return invalid_ids, sum_

In [None]:
pt2(example_ls)

([11,
  22,
  99,
  111,
  999,
  1010,
  1188511885,
  222222,
  222222,
  222222,
  446446,
  38593859,
  565656,
  824824824,
  2121212121],
 4174823709)

hmm, so I think this is basically working now, except that it counts 222222 once per matching pattern length (1, 2 and 3) when it should only count once.

I can see that this happens because it loops over pattern lengths and appends on every match. I suppose I can break out of the pattern-length loop as soon as one matches?

In [None]:
def pt2(data_ls):
    invalid_ids = []
    for rng in data_ls:
        l, r = rng.split("-")
        
        for i in range(int(l), int(r)+1):
            s = str(i)
            len_s = len(s)
            max_repeat = len_s // 2

            for patt_len in range(1, max_repeat+1):
                if len_s % patt_len == 0:
                    num_repeat = int(len_s / patt_len)

                    matching = True 
                    for rep in range(num_repeat-1):
                        matching = matching & (s[(patt_len*rep):(patt_len*(rep+1))] == s[(patt_len*(rep+1)):(patt_len*(rep+2))])
                        if not matching:
                            break
                            
                    if matching:
                        invalid_ids.append(int(s))
                        break

    sum_ = sum(invalid_ids)

    return invalid_ids, sum_

In [None]:
pt2(example_ls)

([11,
  22,
  99,
  111,
  999,
  1010,
  1188511885,
  222222,
  446446,
  38593859,
  565656,
  824824824,
  2121212121],
 4174379265)

very nice
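For comparison, here is a shorter variant based on a common string trick that is not used in this notebook: a string is a repetition of a shorter block exactly when it appears inside its own doubling with the first and last characters removed. The function names are assumptions for illustration.

```python
def is_repeated_by_doubling(s):
    """True if s is a shorter block repeated at least twice (string-doubling trick)."""
    # Dropping the first and last characters of s + s removes the two trivial
    # matches, so any remaining occurrence implies a repeating block.
    return s in (s + s)[1:-1]

def sum_invalid(ranges):
    """Sum every ID in the given 'lo-hi' ranges whose digits are a repeated block."""
    total = 0
    for rng in ranges:
        lo, hi = rng.split("-")
        for n in range(int(lo), int(hi) + 1):
            if is_repeated_by_doubling(str(n)):
                total += n   # one check per ID, so nothing is counted twice
    return total

example = ("11-22,95-115,998-1012,1188511880-1188511890,222220-222224,"
           "1698522-1698528,446443-446449,38593856-38593862,565653-565659,"
           "824824821-824824827,2121212118-2121212124").split(",")
print(sum_invalid(example))  # 4174379265, matching the Part Two example total
```

Because each ID is tested once, the duplicate-counting issue seen earlier with `222222` cannot occur in this form.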

In [None]:
_, pt2_answer = pt2(data_ls)

In [None]:
pt2_answer

24774350322

In [None]:
submit(pt2_answer, part = "b", year = 2025, day = 2)

That's the right answer!  You are one gold star closer to decorating the North Pole. You have completed Day 2! You can [Share on Bluesky Twitter Mastodon] this victory or [Return to Your Advent Calendar].


<urllib3.response.HTTPResponse at 0x72c3605ec940>