<a href="https://colab.research.google.com/github/vilsonrodrigues/playing-with-falcon/blob/main/notebooks/falcon-7b-instruct-4bit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

References: https://huggingface.co/blog/4bit-transformers-bitsandbytes

### Install dependencies

In [None]:
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
!pip install -q -U einops
!pip install -q -U safetensors


### bitsandbytes configs

The 4-bit integration comes with two quantization types: FP4 and NF4. NF4 stands for NormalFloat 4 and was introduced in the QLoRA paper.

You can switch between these two dtypes with `bnb_4bit_quant_type` in `BitsAndBytesConfig`. By default, FP4 quantization is used.

Nested (double) quantization, which also quantizes the quantization constants, saves more memory at no additional performance cost. From the Hugging Face authors' empirical observations, it enables fine-tuning a llama-13b model on an NVIDIA T4 16GB with a sequence length of 1024, a batch size of 1 and 4 gradient accumulation steps.

To enable this feature, add `bnb_4bit_use_double_quant=True` when creating your quantization config.

(text from HF colab)



We will use NF4!
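To see roughly where the savings from double quantization come from, here is a back-of-the-envelope sketch. It assumes the block sizes reported in the QLoRA paper (64 weights per block with one 32-bit constant each, and 256 of those constants per second-level block, stored in 8 bits). These values are assumptions for illustration and are not read from bitsandbytes at runtime.

```python
# Rough overhead of quantization constants, in bits per model parameter.
# Block sizes follow the QLoRA paper and are assumptions here.
WEIGHTS_PER_BLOCK = 64       # weights sharing one quantization constant
CONSTANTS_PER_BLOCK = 256    # first-level constants sharing one second-level constant


def constant_overhead_bits(double_quant: bool) -> float:
    """Extra bits per parameter spent on storing quantization constants."""
    if not double_quant:
        # One 32-bit constant per block of 64 weights.
        return 32 / WEIGHTS_PER_BLOCK
    # 8-bit first-level constants, plus 32-bit second-level constants.
    return 8 / WEIGHTS_PER_BLOCK + 32 / (WEIGHTS_PER_BLOCK * CONSTANTS_PER_BLOCK)


plain = constant_overhead_bits(double_quant=False)
nested = constant_overhead_bits(double_quant=True)
print(f"without double quant: {plain:.3f} bits/param")
print(f"with double quant:    {nested:.3f} bits/param")
print(f"saved:                {plain - nested:.3f} bits/param")
```

Under these assumptions the saving is a fraction of a bit per parameter, which adds up over billions of weights.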

In [None]:
import torch
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

### Load model and pipeline

In [None]:
# My version, re-sharded into smaller safetensors chunks for low-RAM environments
model_id = "vilsonrodrigues/falcon-7b-instruct-sharded"

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [None]:
model_4bit = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        quantization_config=quantization_config,
        trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

A new version of the following files was downloaded from https://huggingface.co/tiiuae/falcon-7b-instruct:
- configuration_RW.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.


A new version of the following files was downloaded from https://huggingface.co/tiiuae/falcon-7b-instruct:
- modelling_RW.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



In [None]:
print(model_4bit)

RWForCausalLM(
  (transformer): RWModel(
    (word_embeddings): Embedding(65024, 4544)
    (h): ModuleList(
      (0-31): 32 x DecoderLayer(
        (input_layernorm): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
        (self_attention): Attention(
          (maybe_rotary): RotaryEmbedding()
          (query_key_value): Linear4bit(in_features=4544, out_features=4672, bias=False)
          (dense): Linear4bit(in_features=4544, out_features=4544, bias=False)
          (attention_dropout): Dropout(p=0.0, inplace=False)
        )
        (mlp): MLP(
          (dense_h_to_4h): Linear4bit(in_features=4544, out_features=18176, bias=False)
          (act): GELU(approximate='none')
          (dense_4h_to_h): Linear4bit(in_features=18176, out_features=4544, bias=False)
        )
      )
    )
    (ln_f): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=4544, out_features=65024, bias=False)
)
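The printout shows that only the attention and MLP projections were replaced by `Linear4bit`; the embedding, the LayerNorms and `lm_head` stay in their original precision. As a rough sketch, the parameter counts can be derived from the shapes above. It leaves out `lm_head`, because the printout does not show whether it shares weights with the embedding, and it ignores the storage used by quantization constants.

```python
# Parameter count derived from the layer shapes in the printout above.
HIDDEN = 4544
VOCAB = 65024
N_LAYERS = 32

# Linear4bit layers in one decoder layer: (in_features, out_features), no bias.
linear_shapes = [
    (HIDDEN, 4672),      # query_key_value
    (HIDDEN, HIDDEN),    # dense
    (HIDDEN, 18176),     # dense_h_to_4h
    (18176, HIDDEN),     # dense_4h_to_h
]

linear_per_layer = sum(i * o for i, o in linear_shapes)
linear_4bit_params = N_LAYERS * linear_per_layer

layernorm_params = 2 * HIDDEN    # weight + bias
embedding_params = VOCAB * HIDDEN
# Two LayerNorms are visible in the printout per decoder layer? Only one
# (input_layernorm) is listed, so count one per layer plus the final ln_f.
other_params = N_LAYERS * layernorm_params + layernorm_params + embedding_params

GIB = 1024 ** 3
print(f"Linear4bit parameters: {linear_4bit_params:,}")
print(f"  at 4 bits:  {linear_4bit_params * 4 / 8 / GIB:.2f} GiB")
print(f"  at 16 bits: {linear_4bit_params * 16 / 8 / GIB:.2f} GiB")
print(f"Embedding + LayerNorm parameters (not quantized): {other_params:,}")
```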


In [None]:
!pip install xformers

Collecting xformers
  Downloading xformers-0.0.20-cp310-cp310-manylinux2014_x86_64.whl (109.1 MB)
Collecting pyre-extensions==0.0.29 (from xformers)
  Downloading pyre_extensions-0.0.29-py3-none-any.whl (12 kB)
Collecting typing-inspect (from pyre-extensions==0.0.29->xformers)
  Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
Collecting mypy-extensions>=0.3.0 (from typing-inspect->pyre-extensions==0.0.29->xformers)
  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Installing collected packages: mypy-extensions, typing-inspect, pyre-extensions, xformers
Successfully installed mypy-extensions-1.0.0 pyre-extensions-0.0.29 typing-inspect-0.9.0 xformers-0.0.20


In [None]:
import torch
import transformers

pipeline = transformers.pipeline(
        "text-generation",
        model=model_4bit,
        tokenizer=tokenizer,
        use_cache=True,
        device_map="auto",
        max_length=296,
        do_sample=True,
        top_k=10,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.eos_token_id,
)

In [None]:
sequences = pipeline(
   "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:")



In [None]:
sequences

[{'generated_text': "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron: Hello, Daniel! How are you doing?\nDaniel: I'm doing well. I have an exciting project planned!\nGirafatron: Oh? Tell me more about it!\nDaniel: It's an exhibition of my photographs! I want to put them up in the park so everyone can enjoy them.\nGirafatron: That's a fantastic idea! Have you decided where in the park you want to hang them?\nDaniel: Yes, I have! I found the perfect spot near a little pond. I think the giraffes would love to drink from it, and they'd make for an amazing display!\nGirafatron: That sounds amazing! I can't wait to see your photographs on display!\nDaniel: Thanks, Girafatron! I can't wait to see them up on that pedestal!"}]

### Use with LangChain

In [None]:
# Workaround for a locale/encoding error in Colab: force UTF-8 as the preferred encoding
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
!pip install langchain

Collecting langchain
  Downloading langchain-0.0.229-py3-none-any.whl (1.3 MB)
Collecting dataclasses-json<0.6.0,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.5.9-py3-none-any.whl (26 kB)
Collecting langchainplus-sdk<0.0.21,>=0.0.20 (from langchain)
  Downloading langchainplus_sdk-0.0.20-py3-none-any.whl (25 kB)
Collecting openapi-schema-pydantic<2.0,>=1.2 (from langchain)
  Downloading openapi_schema_pydantic-1.2.4-py3-none-any.whl (90 kB)
Collecting marshmallow<4.0.0,>=3.3.0 (from dataclasses-json<0.6.0,>=0.5.7->langchain)
  Downloading marshmallow-3.19.0-py3-none-any.whl (49 kB)

In [None]:
from langchain import HuggingFacePipeline

Load local LLM

In [None]:
llm = HuggingFacePipeline(pipeline=pipeline)

Define Template

In [None]:
from langchain import PromptTemplate, LLMChain

template = """Question: {question}
Answer: Let's think step by step."""

prompt = PromptTemplate(
    template=template,
    input_variables=["question"]
)

Chain

In [None]:
llm_chain = LLMChain(prompt=prompt, llm=llm)
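It can help to look at the exact text the model receives. The sketch below uses plain `str.format` as a stand-in for `PromptTemplate`, on the assumption that the template is rendered with Python format-string substitution; it does not call LangChain.

```python
# Stand-in for PromptTemplate rendering: plain Python string formatting.
template = """Question: {question}
Answer: Let's think step by step."""

rendered = template.format(question="How to prepare eggs?")
print(rendered)
```

The model's completion continues right after "Let's think step by step.", which is why the answers in the tests start mid-thought with a leading space.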

### **Tests**

In [None]:
llm_chain("How to prepare eggs?")

{'question': 'How to prepare eggs?',
 'text': " First of all, you need to make sure that the eggs you use are fresh and of the best quality possible (the fresher and fresher the better). To prepare the eggs you can boil, poach or scramble them. Here's a basic recipe for poaching, you can also experiment with different types of seasoning:\n1. Fill a bowl with warm water.\n2. Add a tablespoon of white vinegar or a tablespoon of baking soda for each cup of water you use.\n3. Add the desired spices or seasoning.\n4. Crack the egg into the center of the bowl, and let it float.\n5. Let it sit in the bowl for a minute or two to allow the whites and yolk to solidify.\n6. Gently slide the egg into the water and let it cook until the desired temperature (you can check this by using a thermometer).\n7. Once it's cooked, take it out with a fork or spoon and drain the water.\n8. Let it cool before serving to prevent cracking. You can serve the egg with a variety of things, such as toast, a salad, a

In [None]:
llm_chain("How to start a car?")

{'question': 'How to start a car?',
 'text': " First make sure that you're in the car, in the driver's seat, in a safe environment. Then start the engine by turning the key, pressing the pedal, or pressing your foot on the brake. The engine should then turn on, the car should vibrate, the radio should come on, the lights should turn on and off, and any other electronics should activate. Finally, you should drive off and start the engine."}

In [None]:
# Note: "/n" below is a literal slash followed by "n", not the newline escape "\n".
template2 = """Question: /n {question}. Answer: """

prompt2 = PromptTemplate(
    template=template2,
    input_variables=["question"]
)

In [None]:
llm_chain_2 = LLMChain(prompt=prompt2, llm=llm)

In [None]:
result_explanation = llm_chain_2("Explain antibiotics")

In [None]:
result_explanation['text']

'/n\nAntibiotics are a group of drugs that fight bacteria in our body. They work by interfering with bacterial DNA, preventing the bacteria from growing and dividing. They are used to treat a variety of bacterial infections, including strep throat, urinary tract infections, and pneumonia.'
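The leading `/n` in this output appears to be the model echoing the template: `/n` in `template2` is two ordinary characters, not the newline escape `\n`. A minimal sketch of the difference, using `str.format` as a stand-in for `PromptTemplate`; `template2_newline` is a hypothetical variant and was not used in the runs recorded here.

```python
template2 = """Question: /n {question}. Answer: """
template2_newline = """Question: \n {question}. Answer: """  # hypothetical variant

as_written = template2.format(question="Explain antibiotics")
with_newline = template2_newline.format(question="Explain antibiotics")

# repr() makes the difference between "/n" and a real newline visible.
print(repr(as_written))
print(repr(with_newline))
```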

In [None]:
"""Author-contribution statements and acknowledgements in research papers should state clearly and specifically whether, and to what extent, the authors used AI technologies such as ChatGPT in the preparation of their manuscript and analysis. They should also indicate which LLMs were used. This will alert editors and reviewers to scrutinize manuscripts more carefully for potential biases, inaccuracies and improper source crediting. Likewise, scientific journals should be transparent about their use of LLMs, for example when selecting submitted manuscripts.
Mention the large language model based product mentioned in the paragraph above:"""

In [None]:
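# Portuguese prompt: "In graph analysis, there is a metric called Clustering Coefficient;
# can you tell me how to interpret it?"
prompt_pt_grafos = """No ramo de análise de grafos, existe uma métrica chamada Clustering Coefficient,
   você pode me falar como interpretar ela?"""
llm_chain_2(prompt_pt_grafos)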



{'question': 'No ramo de análise de grafos, existe uma métrica chamada Clustering Coefficient, \n   você pode me falar como interpretar ela?',
 'text': "/n Clustering Coefficient is a metric used in clustering algorithms to quantify the level of 'fidelity' or 'homogeneity' between groups of items. It is commonly used to identify overlapping groups and clusters in a dataset. It ranges from 1 (high) to -1 (low). The closer the value to 1, the more similar the groups and clusters are. Clustering coefficients range from 1 to 4, with 4 indicating that the dataset can be divided into four distinct groups."}

In [None]:
llm_chain_2("what is a convolution?")

{'question': 'what is a convolution?',
 'text': '/n a convolution, in the context of computer graphics, is when an image is processed in multiple dimensions, typically to blur or sharpen the image. This processing can involve convolution of pixels in the image. The term is derived from the mathematical term convolution, which is a mathematical operation that involves overlapping of two functions of the same shape (i.e., the convolution of functions of the shape). The result of the convolution operation is the resulting shape. In computer graphics, the process of convolution is performed to enhance the quality of the rendered image. This is usually achieved by using a mask to filter an image and then convolving an image with the mask. This can be done in real-time on the graphics processing unit (GPU). A convolution can involve a single dimension or multiple dimensions. If there is only one dimension, it is called 1D convolution. If two dimensions, it is called 2D convolution. If more t

In [None]:
prompt_code = """ I have to pass 2 values that are as a string

'2,3'

And turn them into a tuple

(2,3), how to do?
"""

(The response below hit the token limit.)

In [None]:
llm_chain_2(prompt_code)



{'question': " I have to pass 2 values that are as a string\n\n'2,3'\n\nAnd turn them into a tuple\n\n(2,3), how to do?\n",
 'text': '\nOne way to convert the string into a tuple using a list:\n\n- "2,3" is a string,\n- "2,3" is a pair consisting of a string and a number,\n- Therefore, you can convert the string into a list of a pair of strings, then into a tuple.\n- Here\'s an example of how you can do it:\n\n```python\nstring = "2,3"\nstring_list = list(string)\nlist_of_pairs = string_list\npairs = list(list_of_pairs)\n\n```\n\nNow `pairs` is a list containing strings like `"2,3"` and also has a pair with a string and an integer like `("2,3")`.\nTo convert this list of strings into a tuple, you can create an anonymous function to create the tuple:\n\n```python\npairs = list(",".join(s"("%d", i+1)%s%d" for d,i,i+1 in zip(string_list, string_list[i+1], int'}
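The second snippet in the generated answer is cut off and is not valid Python. For comparison, here is a minimal sketch of one common way to do what the prompt asks, assuming the values are comma-separated integers; `parse_pair` is a hypothetical helper name.

```python
def parse_pair(text: str) -> tuple[int, ...]:
    """Split a comma-separated string and convert each piece to int."""
    return tuple(int(part.strip()) for part in text.split(","))


result = parse_pair("2,3")
print(result)  # (2, 3)
```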

In [None]:
prompt_code2 = """
/*
Write a python code to ask the user for their name and say "Hello"
*/
"""
llm_chain_2(prompt_code2)

{'question': '\n/*\nWrite a python code to ask the user for their name and say "Hello"\n*/\n',
 'text': '\nname = input( "What is your name? " )\nprint("Hello!")\nUser '}

In [None]:
llm_chain_2("How to convert a base64 file to bytes in python?")



{'question': 'How to convert a base64 file to bytes in python?',
 'text': "/n \nYou can convert a base64 file to bytes in python by using the 'open()' function to read the file as binary bytes, and then converting these binary bytes to strings using the 'decode()' method. Here's an example: \n\n```\nfile = open('file.b64')\nbytes = file.read()\nfile.close()\nbase64 = ''.join(bytes.decode('utf-8'))\n``` \n\nIn this example, the file `file.b64` is opened using the `open()` function and read as binary bytes using the `read()` method. The resulting binary bytes are then converted to strings using the `decode('utf-8')` method, which converts the binary bytes to a Python strings using the `utf-8` encoding. \n\nFinally, the resulting binary string is joined together using the `join()` method and a blank string to create a base64 encoded string."}
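The generated answer reads the file and decodes it as UTF-8 text, which does not undo the base64 encoding. For comparison, a minimal sketch using the standard library's `base64.b64decode`; the helper name and the temporary file are hypothetical, and the file is assumed to contain only base64 text.

```python
import base64
import os
import tempfile


def read_base64_file(path: str) -> bytes:
    """Read a base64-encoded file and return the decoded raw bytes."""
    with open(path, "rb") as f:
        encoded = f.read()
    return base64.b64decode(encoded)


# Round trip on a temporary file to show the helper in use.
original = b"hello falcon"
with tempfile.NamedTemporaryFile(suffix=".b64", delete=False) as tmp:
    tmp.write(base64.b64encode(original))
    tmp_path = tmp.name

decoded = read_base64_file(tmp_path)
os.remove(tmp_path)
print(decoded)
```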

In [None]:
prompt_sql = """
Table departments, columns = [DepartmentId, DepartmentName]
Table students, columns = [DepartmentId, StudentId, StudentName]
Create a MySQL query for all students in the Computer Science Department
"""
llm_chain_2(prompt_sql)



{'question': '\nTable departments, columns = [DepartmentId, DepartmentName]\nTable students, columns = [DepartmentId, StudentId, StudentName]\nCreate a MySQL query for all students in the Computer Science Department\n',
 'text': '\nThe MySQL query for this would be: \n\n```\nSELECT students.StudentId, students.StudentName \nFROM students\nWHERE departments.DepartmentId = yourDepartmentId\n``` \n\nTo select all students in the Computer Science department, you can use a similar query. The `.` character is used as the wildcard to select all columns, since only the StudentId column is required. \n'}

In [None]:
# Portuguese prompt: "how does the __call__ method work in Python"
llm_chain_2("como funciona o método __call__ em python")

{'question': 'como funciona o método __call__ em python',
 'text': '/n como funciona o método __call__ em python? - /n como funciona o método __call__ em python /n como funciona o método __call__ em python\n<p>__call__ is a method in Python that takes another object as an argument and calls a method of that object. When a new instance of the object is created, the method gets called automatically. However, this behavior can be overridden with an explicit call using the __call__ method. So, to answer the question, __call__ is the method that takes another object as an argument and calls a function or method of that other object. It can be called directly without having a __call__ method defined.'}

In [None]:
llm_chain_2("show me how python's args and kwargs work")



{'question': "show me how python's args and kwargs work",
 'text': '/n\nHere\'s an explanation and example:\n\nIn Python, you can access command-line arguments and their types using the `args` and `kwargs` arguments of a `__main__()` function.\n\n`args` is a list of arguments passed to the function, which include the names of the arguments themselves, their data types, and their order.\n\nFor example:\n\n```python\ndef hello(name, message):\n    args = [\'-n\', \'name\']\n    if len(args) > 1:\n        print(args)\n    print(args[-1])\n    message("Hello, %s!" % name.title())\n\nhello()\n```\n\nIn this example, the `arg` variable contains the arguments `-n` and `name`, which means that the `name` argument is passed with a data type of `<class\'str\'>` and is the last argument passed (after the `-n`.\n\nSimilarly, `kwargs` is used to pass keyword arguments (arguments with data types and/or names). These include arguments that are defined in the `__init__()` method of an object.\n\nFor e
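The generated explanation confuses `*args` and `**kwargs` with command-line arguments. For comparison, a minimal sketch of how they behave in a function signature; `describe` is a hypothetical function.

```python
def describe(*args, **kwargs):
    """Collect extra positional arguments in a tuple and keyword arguments in a dict."""
    return args, kwargs


positional, keyword = describe(1, "two", name="Falcon", bits=4)
print(positional)  # (1, 'two')
print(keyword)     # {'name': 'Falcon', 'bits': 4}

# The same syntax unpacks a sequence and a mapping at the call site.
values = [1, "two"]
options = {"name": "Falcon", "bits": 4}
assert describe(*values, **options) == (positional, keyword)
```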

In [None]:
llm_chain_2("What's latency definition?")

{'question': "What's latency definition?",
 'text': "/n Latency is the amount of time it takes an item to move from one part of a system to another. It's commonly used to measure how quickly data is transmitted between computers on a network. /n "}

In [None]:
llm_chain_2("what is Python's ABC library and what is it for?")

{'question': "what is Python's ABC library and what is it for?",
 'text': '/n The ABC library stands for Absolute Backwards Compatible. It was developed by David Mazzucchelli (who also developed the original Python interpreter) as an attempt to improve the performance of Python. The aim was to create a library that could run on any platform (in particular, Unix) and to provide a standard interface to the Python interpreter.\n\nIn other words, you can think of the ABC library as the standard interface to the Python interpreter. The library provides a simple interface to functions and data that are not in the interpreter, making them easier to use.\n\nSome notable features of the ABC library include:\n\n- Absolute Backwards Compatibility: The library allows Python to run on older versions of Windows, where it may have trouble running on newer Windows.\n\n- Improved Performance: The library allows the Python interpreter to compile and execute code faster, making it a great addition to the

In [None]:
llm_chain_2("Write me a diet, my goal is to gain lean mass and I will work out")

{'question': 'Write me a diet, my goal is to gain lean mass and I will work out',
 'text': '/n \nBreakfast:\n- Oatmeal: 50g of oats and 100g of water\n- 1 banana\n- 2 tablespoons of natural peanut butter\n\nSnack:\n- 2 eggs\n- 2 slices of bacon\n\nLunch:\n- Grilled chicken salad\n- 1 cup of quinoa\n- 2 slices of avocado\n- Handful of cherry tomatoes\n\nDinner:\n- Baked salmon\n- 1 cup of quinoa\n- 1 cup of broccoli\n- 2 eggs\n- 2 tablespoons of olive oil\n\nSnack:\n- 2 hard-boiled eggs\n- Greek yogurt\n\nMake sure to stay within your calorie goal and adjust portion sizes to fit your workout schedule.'}

In [None]:
llm_chain_2("what is the difference between Similarity embeddings and search embeddings")



{'question': 'what is the difference between Similarity embeddings and search embeddings',
 'text': '/n \nSimilarity embeddings represent the relationship between two items in a high-dimensional space. Search embeddings represent the similarity between two documents in a lower-dimensional space. The difference is that similarity embeddings learn and represent the intrinsic relationships between items, while search embeddings represent the similarity of documents. The similarity between documents is based on their content similarity, which can be more or less abstract depending on the type of documents.'}

Conversation

In [None]:
# Note: "/n" below is a literal slash followed by "n", not the newline escape "\n".
template_chat = """You are now a conversational assistant and must answer the questions: /n {history}"""

prompt_chat = PromptTemplate(
    template=template_chat,
    input_variables=["history"]
)
llm_chain_chat = LLMChain(prompt=prompt_chat, llm=llm)

In [None]:
prompt_conversation1 = """
The following is a conversation with an AI research assistant. The assistant tone is technical and scientific.
Human: Hello, who are you?
AI: Greeting! I am an AI research assistant. How can I help you today?
Human: Can you tell me about the creation of blackholes?
AI:
"""

In [None]:
llm_chain_chat(prompt_conversation1)



{'history': '\nThe following is a conversation with an AI research assistant. The assistant tone is technical and scientific.\nHuman: Hello, who are you?\nAI: Greeting! I am an AI research assistant. How can I help you today?\nHuman: Can you tell me about the creation of blackholes?\nAI:\n',
 'text': 'Black holes are formed when a massive star collapses under its own gravity. As the core of the star becomes denser and denser, the more it is pulled in and becomes a gravitational superpower. Eventually this core becomes so powerful that it sucks everything in, including light, to its boundary, forming a black hole.'}