## Get Size of S3 Objects

Let us go through how to get the total size of S3 objects using `MaxKeys` and `Marker`. We will build on the code that gets the count of S3 objects.

* Here is the code used to get the count of objects in S3.

```python
marker = ''
object_count = 0
while True:
    s3_objects = s3_client.list_objects(
        Bucket='itv-genlogs',
        Prefix='logs/year',
        Marker=marker,
        MaxKeys=200
    ).get('Contents')
    if not s3_objects:
        break
    object_count += len(s3_objects)
    marker = s3_objects[-1]['Key']
    print(marker)
```

* Create a client with the appropriate profile.
* Invoke `list_objects` in pages using `MaxKeys` and `Marker`.
* Each entry in the output of `list_objects` contains `Size` along with `Key` and other details.
* Add up the `Size` of each entry to get the total size of the objects under the prefix in our S3 bucket. The size in each entry is in bytes, so you might have to convert it to megabytes.
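The last step mentions converting bytes to megabytes. Here is a minimal sketch of that conversion, assuming binary units (1 MB = 1024 * 1024 bytes). `bytes_to_mb` is a hypothetical helper name, not part of boto3, and the sample sizes are made up for illustration.

```python
def bytes_to_mb(num_bytes):
    """Convert a byte count to megabytes, assuming 1 MB = 1024 * 1024 bytes."""
    return num_bytes / (1024 * 1024)


# The Size values reported by list_objects are plain integers in bytes.
sizes_in_bytes = [1048576, 524288]  # illustrative values, not real bucket data
total_mb = bytes_to_mb(sum(sizes_in_bytes))
print(f'{total_mb:.2f} MB')
```

If you prefer decimal units, divide by `1000 * 1000` instead.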

In [1]:
import boto3

In [2]:
import os
os.environ.setdefault('AWS_PROFILE', 'itvgenlogs')

'itvgenlogs'

In [3]:
s3_client = boto3.client('s3')

In [4]:
s3_objects = s3_client.list_objects(
    Bucket='itv-genlogs-mana00',
    Prefix='logs/year'
)

In [5]:
s3_objects.keys()

dict_keys(['ResponseMetadata', 'IsTruncated', 'Marker', 'Contents', 'Name', 'Prefix', 'MaxKeys', 'EncodingType'])

In [6]:
s3_objects['Contents'][0]

{'Key': 'logs/year=2024/month=02/day=16/gen_logs_s3-1-2024-02-16-12-51-20-2b9337a1-7e5c-4a5d-84cc-3587b8d40e07',
 'LastModified': datetime.datetime(2024, 2, 16, 12, 52, 22, tzinfo=tzutc()),
 'ETag': '"d7592fc75dbb74e210dfe695cab29f83"',
 'Size': 24450,
 'StorageClass': 'STANDARD',
 'Owner': {'DisplayName': 'laiaddara',
  'ID': '709f2485bbc57aa0687e130826e0d8c48d3beaba7e7f08305a5a39db5536f4f3'}}

In [7]:
s3_objects['Contents'][0]['Size']

24450

In [8]:
objects_size = 0.0

for s3_object in s3_objects['Contents']:
    objects_size += s3_object['Size']

In [9]:
objects_size

97512.0

In [10]:
!pip install hurry.filesize

Collecting hurry.filesize
  Downloading hurry.filesize-0.9.tar.gz (2.8 kB)
Building wheels for collected packages: hurry.filesize
  Building wheel for hurry.filesize (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/wahaha/Projects/Internal/GenlogsS3/GenLogsS3-venv/bin/python3 -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-tl3s2szx/hurry.filesize/setup.py'"'"'; __file__='"'"'/tmp/pip-install-tl3s2szx/hurry.filesize/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-54o_yuv8
       cwd: /tmp/pip-install-tl3s2szx/hurry.filesize/
  Complete output (6 lines):
  usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
     or: setup.py --help [cmd1 cmd2 ...]
     or: setup.py --help-commands
     or: setup.py cmd --help
  
  error: invalid command '

In [11]:
from hurry.filesize import size
size(objects_size)

ModuleNotFoundError: No module named 'hurry'
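
Since the `hurry.filesize` install failed in this environment, the byte count can also be formatted with plain Python. Here is a minimal sketch. `human_size` is a hypothetical helper, and its output format (binary units, one decimal place) is an assumption that will not necessarily match what `hurry.filesize` prints.

```python
def human_size(num_bytes):
    """Format a byte count using binary units (1 KB = 1024 bytes)."""
    units = ['B', 'KB', 'MB', 'GB', 'TB']
    value = float(num_bytes)
    # Divide by 1024 until the value fits in the current unit.
    for unit in units[:-1]:
        if value < 1024:
            return f'{value:.1f} {unit}'
        value /= 1024
    # Anything left over is reported in the largest unit.
    return f'{value:.1f} {units[-1]}'


print(human_size(97512))  # prints "95.2 KB"
```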

In [None]:
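marker = ''  # an empty Marker starts listing from the first key
objects_size = 0.0
while True:
    # Fetch the next page of up to 200 objects whose keys come after `marker`.
    s3_objects = s3_client.list_objects(
        Bucket='itv-genlogs-mana00',
        Prefix='logs/year',
        Marker=marker,
        MaxKeys=200
    ).get('Contents')
    # 'Contents' is missing once there are no more objects to list.
    if not s3_objects:
        break
    for s3_object in s3_objects:
        objects_size += s3_object['Size']  # Size is in bytes
    # Resume the next request after the last key of this page.
    marker = s3_objects[-1]['Key']
    print(marker)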

In [None]:
objects_size

In [None]:
size(objects_size)
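
As an alternative to tracking `Marker` by hand, boto3 paginators can walk the pages for us. The sketch below swaps in the `list_objects_v2` paginator, which is a different technique from the `Marker` loop used in this section. `total_size_bytes` is a hypothetical helper name, and the usage lines assume valid AWS credentials and the bucket used above.

```python
def total_size_bytes(s3_client, bucket, prefix, page_size=200):
    """Sum the Size of every object under the prefix, one page at a time."""
    paginator = s3_client.get_paginator('list_objects_v2')
    pages = paginator.paginate(
        Bucket=bucket,
        Prefix=prefix,
        PaginationConfig={'PageSize': page_size},
    )
    total = 0
    for page in pages:
        # 'Contents' is absent when a page has no matching objects.
        for s3_object in page.get('Contents', []):
            total += s3_object['Size']
    return total


# Usage (assumes valid AWS credentials and the bucket from this section):
# import boto3
# s3_client = boto3.client('s3')
# total_size_bytes(s3_client, 'itv-genlogs-mana00', 'logs/year')
```

The function takes the client as an argument, so it can be tried against a stub object without touching AWS.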