In [None]:
import torch
import numpy as np


'''
Say "fizz" for multiples of 3, "buzz" for multiples of 5, and "fizzbuzz" for multiples of 15.
'''


The cell imports PyTorch and NumPy. The triple-quoted string that follows acts as a comment stating the rules of the FizzBuzz game: say "fizz" for multiples of 3, "buzz" for multiples of 5, and "fizzbuzz" for multiples of 15.


In [None]:

# Hand-written rule code
def fizz_buzz_encode(i):
    if i % 15 == 0:
        return 3
    elif i % 5 == 0:
        return 2
    elif i % 3 == 0:
        return 1
    else:
        return 0


def fizz_buzz_decode(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]


def helper(i):
    print(fizz_buzz_decode(i, fizz_buzz_encode(i)))


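# Sanity check of the rule functions; disabled for now because it is not needed.
# for i in range(1, 16):
#     helper(i)

# Number of bits used to encode each input integer.
NUM_DIGITS = 10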


def binary_encode(i, num_digits):
    return np.array([i >> d & 1 for d in range(num_digits)][::-1])


The code defines three functions: `fizz_buzz_encode()`, `fizz_buzz_decode()`, and `helper()`.

The `fizz_buzz_encode()` function takes an integer `i` and returns its FizzBuzz class as an integer: 3 if `i` is divisible by 15, 2 if it is divisible by 5, 1 if it is divisible by 3, and 0 otherwise. The check for 15 has to come first, because every multiple of 15 is also a multiple of 3 and of 5.

The `fizz_buzz_decode()` function takes an integer `i` and an integer `prediction` as input and returns a string that decodes the FizzBuzz game rules. If `prediction` is 0, the function returns the string representation of `i`. If `prediction` is 1, the function returns the string "fizz". If `prediction` is 2, the function returns the string "buzz". If `prediction` is 3, the function returns the string "fizzbuzz".

The `helper()` function takes an integer `i` as input and prints the decoded FizzBuzz game rules for that integer using the `fizz_buzz_encode()` and `fizz_buzz_decode()` functions.
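
As a quick check of the rule functions described above, the following sketch repeats their definitions so that it runs on its own and collects the decoded outputs for the integers 1 through 15. The `results` list is added here purely for illustration and is not part of the notebook.

```python
def fizz_buzz_encode(i):
    # Check 15 first: multiples of 15 are also multiples of 3 and of 5.
    if i % 15 == 0:
        return 3
    elif i % 5 == 0:
        return 2
    elif i % 3 == 0:
        return 1
    else:
        return 0


def fizz_buzz_decode(i, prediction):
    # Index 0 falls back to the number itself.
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]


# Illustrative only: decode the rule-based label for each integer from 1 to 15.
results = [fizz_buzz_decode(i, fizz_buzz_encode(i)) for i in range(1, 16)]
print(results)
```

The printed list ends in `'fizzbuzz'` for 15, which is the case that would be mislabeled if the divisibility checks were reordered.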

The cell also sets `NUM_DIGITS = 10` and defines `binary_encode()`, which takes an integer `i` and a width `num_digits` and returns a NumPy array holding the `num_digits` lowest bits of `i`, most significant bit first. These bit vectors are the network's input features.
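
To make the bit order concrete, here is a small sketch that repeats `binary_encode()` and applies it to two sample values. The sample inputs 5 and 1023 are chosen here for illustration.

```python
import numpy as np


def binary_encode(i, num_digits):
    # Extract bits from least to most significant, then reverse the list
    # so that the most significant bit comes first.
    return np.array([i >> d & 1 for d in range(num_digits)][::-1])


print(binary_encode(5, 10))     # [0 0 0 0 0 0 0 1 0 1]
print(binary_encode(1023, 10))  # [1 1 1 1 1 1 1 1 1 1]
```

With 10 digits the largest representable input is 1023; any higher bits of a larger integer would simply be dropped.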


In [None]:

trX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = torch.LongTensor([fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)])
print(trX.shape)

'''
Define the model with PyTorch
'''
NUM_HIDDEN = 100
model = torch.nn.Sequential(
    torch.nn.Linear(NUM_DIGITS, NUM_HIDDEN),
    torch.nn.ReLU(),
    torch.nn.Linear(NUM_HIDDEN, 4)  # 4 logits, after softmax, we get a probability distribution
)
if torch.cuda.is_available():
    model = model.cuda()

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

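The code defines a PyTorch tensor `trX` that contains the binary encoding of the integers from 101 to 2^NUM_DIGITS - 1 (1023 when `NUM_DIGITS` is 10), obtained with the `binary_encode()` function defined earlier. That is 923 rows of 10 bits each, so `print(trX.shape)` shows `torch.Size([923, 10])`. The integers 1 to 100 are left out of training because they serve as the test set later.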

The code also defines a PyTorch tensor `trY` that contains the encoded FizzBuzz game rules for the integers from 101 to 2^NUM_DIGITS - 1. The encoded rules are obtained using the `fizz_buzz_encode()` function defined earlier.
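
It can help to know how the four classes are distributed in the training labels, since that sets a baseline for judging the model. The sketch below recomputes the labels with plain Python and counts them; the names `labels`, `counts`, and `majority_baseline` are introduced here for illustration.

```python
from collections import Counter


def fizz_buzz_encode(i):
    if i % 15 == 0:
        return 3
    elif i % 5 == 0:
        return 2
    elif i % 3 == 0:
        return 1
    else:
        return 0


NUM_DIGITS = 10
# Same integer range as the training set: 101 up to and including 1023.
labels = [fizz_buzz_encode(i) for i in range(101, 2 ** NUM_DIGITS)]
counts = Counter(labels)

# Accuracy of a trivial classifier that always predicts class 0.
majority_baseline = counts[0] / len(labels)

print(len(labels))                  # 923
print(sorted(counts.items()))       # [(0, 493), (1, 246), (2, 122), (3, 62)]
print(round(majority_baseline, 3))  # 0.534
```

A model that always outputs the plain number would already be right a little over half the time, so training accuracy only becomes informative well above that level.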

The code then defines a neural network with the `torch.nn.Sequential` container. The model consists of three layers: a linear layer with `NUM_DIGITS` input features and `NUM_HIDDEN` (100) output features, a ReLU activation, and a linear layer with `NUM_HIDDEN` input features and 4 output features. The 4 outputs are raw scores (logits), one for each FizzBuzz class (0, 1, 2, or 3).

The code checks if a GPU is available and moves the model to the GPU if it is.
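
A common alternative to repeating the `torch.cuda.is_available()` check is to pick a `torch.device` once and pass it wherever tensors or modules are created. The sketch below shows that pattern on a model with the same layer sizes; the variable names `device`, `x`, and `logits` are assumptions made for this example, not part of the notebook.

```python
import torch

# Choose the device once; falls back to the CPU when no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Same architecture as in the notebook, with the sizes written out.
model = torch.nn.Sequential(
    torch.nn.Linear(10, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 4),
).to(device)

# Inputs must live on the same device as the model.
x = torch.zeros(3, 10, device=device)
logits = model(x)
print(logits.shape)  # torch.Size([3, 4])
```

The same code then runs unchanged on machines with and without a GPU.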

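The code defines a cross-entropy loss function and a stochastic gradient descent optimizer with a learning rate of 0.05. `torch.nn.CrossEntropyLoss` expects raw logits and applies the log-softmax itself, which is why the model has no softmax layer at the end.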
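
To make the loss concrete, here is a NumPy sketch of the quantity that `torch.nn.CrossEntropyLoss` computes with its default settings (mean reduction, no class weights): a log-softmax over the logits followed by the average negative log-likelihood of the target classes. The function name `cross_entropy` and the sample inputs are introduced here for illustration.

```python
import numpy as np


def cross_entropy(logits, targets):
    """Mean softmax cross-entropy for integer class targets."""
    # Subtract the row maximum for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class in each row.
    return -log_probs[np.arange(len(targets)), targets].mean()


# With all-zero logits every class has probability 1/4,
# so the loss equals log(4), roughly 1.386.
uniform_loss = cross_entropy(np.zeros((2, 4)), np.array([0, 3]))
print(uniform_loss)
```

A loss near 1.386 at the start of training therefore means the network is still guessing uniformly among the four classes.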


In [None]:

BATCH_SIZE = 128
for epoch in range(10000):
    for start in range(0, len(trX), BATCH_SIZE):
        end = start + BATCH_SIZE
        batchX = trX[start:end]
        batchY = trY[start:end]

        if torch.cuda.is_available():
            batchX = batchX.cuda()
            batchY = batchY.cuda()

        y_pred = model(batchX)  # forward pass
        loss = loss_fn(y_pred, batchY)

        print("epoch", epoch, loss.item())

        optimizer.zero_grad()  # clear gradients from the previous step
        loss.backward()  # backward pass
        optimizer.step()  # gradient descent update

The code sets a batch size of 128 in the variable `BATCH_SIZE` and then trains the network with a nested loop. The outer loop runs for 10000 epochs, and the inner loop walks through the training data in batches of `BATCH_SIZE` rows, always in the same order.

For each batch, the code selects a subset of the training data using the `start` and `end` indices. The input data and target labels for the batch are stored in the variables `batchX` and `batchY`, respectively. If a GPU is available, the data is moved to the GPU using the `cuda()` method.
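
The slicing described above can be checked without any tensors. The sketch below uses a plain `range` as a stand-in for the 923 training rows and records the size of each batch; `num_samples`, `batch_size`, and `batch_sizes` are names made up for this example.

```python
num_samples = 923  # number of integers in range(101, 2 ** 10)
batch_size = 128

batch_sizes = []
for start in range(0, num_samples, batch_size):
    end = start + batch_size
    # Slicing past the end is safe: the slice is clipped to the data length.
    batch_sizes.append(len(range(num_samples)[start:end]))

print(batch_sizes)  # [128, 128, 128, 128, 128, 128, 128, 27]
```

Each epoch therefore performs eight parameter updates, the last one on a short batch of 27 rows.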

The code then performs a forward pass through the neural network model using the input data `batchX`. The predicted output is stored in the variable `y_pred`. The code calculates the loss between the predicted output and the target labels using the cross-entropy loss function defined earlier.

The code prints the epoch number and the current loss value for each batch. The code then sets the gradients of all model parameters to zero using the `zero_grad()` method, performs a backward pass through the model using the `backward()` method, and updates the model parameters using the gradient descent optimizer defined earlier with the `step()` method.
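
The reason gradients are cleared on every iteration is that PyTorch adds new gradients to whatever is already stored in `.grad`. The following sketch shows this on a single scalar parameter; the variable names are made up for the example, and the manual `w.grad.zero_()` call stands in for what `optimizer.zero_grad()` achieves for all model parameters.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# d(3 * w)/dw = 3
(3 * w).backward()
first = w.grad.item()

# A second backward pass without a reset adds to the stored gradient.
(3 * w).backward()
accumulated = w.grad.item()

# After clearing, the gradient reflects only the latest backward pass.
w.grad.zero_()
(3 * w).backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)  # 3.0 6.0 3.0
```

Without the reset, each update in the training loop would be computed from the sum of all gradients seen so far instead of the current batch alone.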


In [None]:

testX = torch.Tensor([binary_encode(i, NUM_DIGITS) for i in range(1, 101)])
if torch.cuda.is_available():
    testX = testX.cuda()
with torch.no_grad():
    testY = model(testX)

predictions = zip(range(1, 101), testY.argmax(dim=1).cpu().tolist())

print([fizz_buzz_decode(i, x) for i, x in predictions])

The code defines a PyTorch tensor `testX` that contains the binary encoding of the integers from 1 to 100. The binary encoding is obtained using the `binary_encode()` function defined earlier.

The code checks if a GPU is available and moves the `testX` tensor to the GPU if it is.

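The code then uses the trained neural network model to make predictions for the input data `testX`. The `torch.no_grad()` context manager disables gradient tracking during the forward pass, which reduces memory usage and speeds up computation. For each input, the index of the largest of the four logits is taken as the predicted class, paired with the corresponding integer from 1 to 100, and decoded with `fizz_buzz_decode()` before the resulting list is printed.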
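
The notebook only prints the decoded predictions. One way to put a number on their quality is to compare the predicted class indices with the rule-based labels. The sketch below is a minimal, model-free version of that idea: `accuracy` is a hypothetical helper, and `always_zero` is a made-up stand-in for the list of predicted classes that the notebook obtains from `testY`.

```python
def fizz_buzz_encode(i):
    if i % 15 == 0:
        return 3
    elif i % 5 == 0:
        return 2
    elif i % 3 == 0:
        return 1
    else:
        return 0


def accuracy(numbers, predicted_classes):
    # Fraction of inputs whose predicted class matches the rule-based label.
    correct = sum(
        fizz_buzz_encode(i) == p for i, p in zip(numbers, predicted_classes)
    )
    return correct / len(numbers)


numbers = list(range(1, 101))

# Hypothetical predictions: a classifier that always outputs class 0.
always_zero = [0] * len(numbers)
print(accuracy(numbers, always_zero))  # 0.53
```

Passing the model's predicted class list in place of `always_zero` would give the test accuracy on 1 to 100, with 0.53 as the floor to beat.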
