# Testing init() and create_collection() functions from datashelf.core

Tested the following:
- init() new datashelf
- Try to init() when .datashelf/ already exists -> should return a message and do nothing
- Removed .datashelf/ directory and tried to create a collection -> should raise a TypeError
- Test init() and create_collection() together -> collection should be made with metadata file with some metadata prepopulated
- Try to create a collection that already exists -> should return a message and do nothing
- Removed metadata file of collection
- Try to create a collection when the collection exists but doesn't have a metadata file -> should return a message and create the metadata file

For this set of tests, import init and create_collection from datashelf.core. You'll also need os and shutil for directory paths.

In [1]:
from datashelf.core import init, create_collection
import os
import shutil

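First, init a new datashelf. Make sure to run this in your project's root directory (or not, but that's where it makes the most sense!). If a .datashelf directory is already present, as in the run captured below, init() just logs that it is already initialized.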

In [2]:
# Create a datashelf when one does not exist yet
init()

2025-07-20 22:33:58,813 - INFO - .datashelf directory and metadata already initalized.


If you run init() again in the same directory, it should log a message telling you that a .datashelf directory already exists.

In [3]:
# Try to init when .datashelf already exists
init()

2025-07-20 22:34:00,226 - INFO - .datashelf directory and metadata already initalized.


Define a small function to delete a datashelf directory (I should probably add this to a module).

In [4]:
def rm_datashelf_dir():
    try:
        dir_path = os.path.join(os.getcwd(), ".datashelf")
        shutil.rmtree(dir_path)
        print(f"Directory '{dir_path}' and its contents successfully deleted.")
    except OSError as e:
        print(f"Error deleting directory: {e}")

Use create_collection() to create a test collection. The collection's folder name should be the collection name in all lowercase with spaces replaced by underscores. A YAML metadata file named {collection name}_metadata.yaml should also be generated automatically, using the same lowercase, underscored name.

In [5]:
# Create a collection when .datashelf exists - should also populate some basic metadata
create_collection("test collection")

2025-07-20 22:34:02,313 - INFO - collection directory: test collection and metadata file test_collection_metadata.yaml created.
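The naming rule described above can be sketched as follows. This only illustrates the expected mapping from a collection name to its folder and metadata file names; `normalize_collection_name` and `metadata_filename` are hypothetical helpers, not functions from datashelf, and the real implementation may treat edge cases (extra whitespace, punctuation) differently.

```python
def normalize_collection_name(name: str) -> str:
    """Lowercase the name and replace spaces with underscores (hypothetical helper)."""
    return name.lower().replace(" ", "_")


def metadata_filename(name: str) -> str:
    """Build the expected metadata file name for a collection (hypothetical helper)."""
    return f"{normalize_collection_name(name)}_metadata.yaml"


print(normalize_collection_name("Collection 1"))  # collection_1
print(metadata_filename("test collection"))       # test_collection_metadata.yaml
```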


Now, remove the datashelf directory with the function you created above. If you run create_collection() now, it should raise a TypeError.

In [6]:
# Delete .datashelf and rerun create_collection to confirm that it raises an error - should be a TypeError
rm_datashelf_dir()
create_collection("Collection 1")

Directory '/Users/rohankrishnan/Documents/GitHub/datashelf/.datashelf' and its contents successfully deleted.


TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
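The wording of this error matches what `os.stat` raises when it is given `None` instead of a path. A plausible explanation (an assumption, since the notebook doesn't show datashelf's internals) is that the lookup for the .datashelf directory returned `None` and that value was then passed to a path check. The snippet below reproduces the same kind of error using only the standard library:

```python
import os

try:
    # Passing None where a path is expected triggers the TypeError.
    os.stat(None)
except TypeError as e:
    message = str(e)
    print(f"TypeError: {message}")
```

If that is what happens inside create_collection(), an explicit check for a missing .datashelf directory with a clearer error message might be friendlier than letting the TypeError surface.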

Create a new .datashelf directory and create a new test collection.

In [7]:
init()
create_collection("Collection 1")

2025-07-20 22:34:06,331 - INFO - .datashelf directory and metadata initialized
2025-07-20 22:34:06,336 - INFO - collection directory: Collection 1 and metadata file collection_1_metadata.yaml created.


If you try to create another collection with the same name, you should get a log message saying that the collection already exists.

In [8]:
# Try to create a collection that already exists
create_collection("Collection 1")

2025-07-20 22:34:07,460 - INFO - collection already exists at /Users/rohankrishnan/Documents/GitHub/datashelf/.datashelf/collection_1 with metadata file: collection_1_metadata.yaml.


Define a function to remove the metadata file of a datashelf collection (I don't think this is useful outside of testing, but I should probably put this function somewhere more permanent).

In [9]:
# Function to remove the collection's YAML metadata file
def rm_collection_metadata():
    # Build the path once so the existence check and the delete use the same file
    metadata_path = os.path.join(os.getcwd(), ".datashelf", "collection_1", "collection_1_metadata.yaml")
    if os.path.exists(metadata_path):
        try:
            os.remove(metadata_path)
            print("Metadata deleted successfully.")
        except OSError as e:
            print(f"An error occurred: {e}")
    else:
        print("Metadata does not exist.")

Test that the function works: the first call should delete the file, and the second should report that it no longer exists.

In [10]:
rm_collection_metadata()


Metadata deleted successfully.


In [11]:
rm_collection_metadata()


Metadata does not exist.
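If this helper does move somewhere more permanent, one option is to take the collection folder name and the project root as parameters rather than hardcoding collection_1. The version below is a minimal sketch under that assumption; `remove_collection_metadata` and its `root` parameter are hypothetical and not part of datashelf. It is exercised against a temporary directory so no real datashelf is touched.

```python
import tempfile
from pathlib import Path
from typing import Optional


def remove_collection_metadata(collection: str, root: Optional[Path] = None) -> bool:
    """Delete <root>/.datashelf/<collection>/<collection>_metadata.yaml.

    Returns True if the file was deleted, False if it did not exist.
    (Hypothetical helper; `collection` is the already-normalized folder name.)
    """
    root = Path.cwd() if root is None else Path(root)
    metadata = root / ".datashelf" / collection / f"{collection}_metadata.yaml"
    if not metadata.is_file():
        return False
    metadata.unlink()
    return True


# Build a throwaway layout that mimics the one used in this notebook.
with tempfile.TemporaryDirectory() as tmp:
    collection_dir = Path(tmp) / ".datashelf" / "collection_1"
    collection_dir.mkdir(parents=True)
    (collection_dir / "collection_1_metadata.yaml").write_text("name: Collection 1\n")

    first = remove_collection_metadata("collection_1", root=Path(tmp))
    second = remove_collection_metadata("collection_1", root=Path(tmp))
    print(first, second)  # True False
```

Returning a boolean instead of printing makes the two outcomes easy to assert on in an automated test.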


If you try to create a collection and create_collection() finds that the collection directory already exists but has no metadata file (I'm not sure when this would happen), it will let you know that the collection exists and just create a new metadata file.

In [12]:
# Try to create a collection when collection exists but doesn't have metadata file - should also populate with basic metadata
create_collection("Collection 1")

2025-07-20 22:34:13,406 - INFO - collection directory: Collection 1 exists but does not have a metadata file.
Creating collection_1_metadata.yaml...
2025-07-20 22:34:13,409 - INFO - Metadata file: collection_1_metadata.yaml created in /Users/rohankrishnan/Documents/GitHub/datashelf/.datashelf/collection_1.
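The three situations exercised above (no collection directory, a directory without metadata, and a directory with metadata) can be told apart with two filesystem checks. The sketch below is an assumption about how such a check could look, not datashelf's actual logic; `collection_state` is a hypothetical helper, and it runs against a temporary directory.

```python
import tempfile
from pathlib import Path


def collection_state(root: Path, collection: str) -> str:
    """Classify a collection folder as 'missing', 'no_metadata', or 'complete'.

    Hypothetical helper; `collection` is the normalized folder name.
    """
    collection_dir = Path(root) / ".datashelf" / collection
    if not collection_dir.is_dir():
        return "missing"
    if not (collection_dir / f"{collection}_metadata.yaml").is_file():
        return "no_metadata"
    return "complete"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # 1. Nothing has been created yet.
    states = [collection_state(root, "collection_1")]

    # 2. The collection directory exists, but there is no metadata file.
    (root / ".datashelf" / "collection_1").mkdir(parents=True)
    states.append(collection_state(root, "collection_1"))

    # 3. The metadata file is present as well.
    (root / ".datashelf" / "collection_1" / "collection_1_metadata.yaml").write_text("")
    states.append(collection_state(root, "collection_1"))

print(states)  # ['missing', 'no_metadata', 'complete']
```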
