# Welcome to crocodile Tutorial

In the next 20 minutes, you're expected to master the use of this friendly library. As an outcome, it will save you hundreds of hours in the future while doing mundane daily tasks, and, your code will be irreducibly succinct.

Intrigued? Let's get started.

First things first, let's ascertain that you have the latest version.

In [3]:
# !pip install --upgrade crocodile

### Always begin your Python files with this import

In [1]:
import crocodile.toolbox as tb

# Overview

We will be covering the following:
    
1- The `P` class (for Path).

2- The `L` class (for List).



### P Class

As a user of Python, you must have experienced the inconvenience of path handling. There are:
 * `os` module
 * `glob` module
 * `sys` module
 * `shutil` module
 * `pathlib` module introduced in Python 3.4.
 * There is probably more that I'm not aware of.
 
Those are extremely verbose, many are archaic and kept only for compatibility, and they cost you a lot of time before they give you what you want.

##### Solution:
 
 `P` class elegantly solves this by converting mere **path strings** to **objects of type `P`**. 
 Strings are good for text parsing and processing, but they are rubbish for path management.
 
 A `P` object by itself has *all* the necessary methods that could possibly be linked to a path, so there is no need for any other module to help get things done.
 
 Let's give it a try: 

In [4]:
# wrap any string with this module to instantiate a `P` object and you are good to go.
p = tb.P.home().joinpath("tmp_results/tmp_folders/subsubfolder/file.txt")
p

👻NotExist 'file:\C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt'

Now, what can this `p` object offer us?

Let's see some of its powers. Firstly, let's look at indexing.

Question:
* Doesn't it just make sense that `p[0]` gives the first component of the path?

In [6]:
p[0]  # Did you know that the old, inelegant way is: p.parent.parent.parent.parent.parent.parent ?

📁 'file:\C:' | 2019-03-19  15:07:21

In [7]:
p[-1]  

📍 Relative 'file.txt'

In [8]:
p[2:]  # slicing!

📍 Relative 'aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt'

In [9]:
p[[0, -1]]  # You did not see that coming ! fancy indexing!

👻NotExist 'file:\C:\file.txt'
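To make the indexing behaviour above concrete, here is a minimal sketch of how integer, slice and list indexing could be layered on top of the standard library's `pathlib`. The helper name `pick` is hypothetical and this is not crocodile's actual implementation; `PureWindowsPath` is used only so that the results look the same on every platform.

```python
from pathlib import PureWindowsPath


def pick(path, index):
    """Hypothetical helper: build a new path from selected components of `path`."""
    parts = path.parts  # e.g. ('C:\\', 'Users', 'aalsaf01', ...)
    if isinstance(index, slice):
        chosen = parts[index]
    elif isinstance(index, (list, tuple)):
        chosen = [parts[i] for i in index]  # "fancy" indexing
    else:
        chosen = [parts[index]]
    return type(path)(*chosen)


p = PureWindowsPath(r"C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt")
first = pick(p, 0)             # like p[0]
last = pick(p, -1)             # like p[-1]
tail = pick(p, slice(2, None)) # like p[2:]
ends = pick(p, [0, -1])        # like p[[0, -1]]
```

Everything hinges on `path.parts`, which already exposes the components as a tuple; the helper merely re-assembles a path from whichever components were selected.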

In [11]:
r = p.switch_by_index(idx=2, val="new_name")
print(r)


C:\Users\new_name\tmp_results\tmp_folders\subsubfolder\file.txt
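For comparison, replacing one component by position can be sketched with plain `pathlib` as follows. The function below only borrows the name and arguments of `switch_by_index` for readability; its body is an assumption about the behaviour, not the library's code.

```python
from pathlib import PureWindowsPath


def switch_by_index(path, idx, val):
    """Assumed behaviour: return a copy of `path` with component `idx` replaced by `val`."""
    parts = list(path.parts)
    parts[idx] = val
    return type(path)(*parts)


p = PureWindowsPath(r"C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt")
r = switch_by_index(p, idx=2, val="new_name")
print(r)
```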


In [13]:
print(p.split(index=2))  # split by index
print(p.split(at="tmp_results"))  # split by directory name

(📁 'file:\C:\Users' | 2022-05-07  14:47:22, 📍 Relative 'aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt')
(📁 'file:\C:\Users\aalsaf01' | 2022-09-28  16:05:23, 📍 Relative 'tmp_results\tmp_folders\subsubfolder\file.txt')
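The two ways of splitting shown above can be approximated with the standard library as well. This is a rough sketch under the assumption that splitting by name cuts just before the first matching component; `split_path` is a hypothetical helper, not a crocodile function.

```python
from pathlib import PureWindowsPath


def split_path(path, index=None, at=None):
    """Hypothetical helper: cut `path` in two, at a position or at a component name."""
    parts = path.parts
    if at is not None:
        index = parts.index(at)  # raises ValueError if `at` is not a component
    if index is None:
        raise ValueError("pass either `index` or `at`")
    cls = type(path)
    return cls(*parts[:index]), cls(*parts[index:])


p = PureWindowsPath(r"C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt")
by_index = split_path(p, index=2)
by_name = split_path(p, at="tmp_results")
```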


But wait, this path doesn't even exist!!
Okay, let's create it.


In [14]:
p.create()
print(p.exists())

True


If you like the good old methods that provide you with safeguards and checks when creating, then, they're all there. 

```
When developed, the library **never** overrides a method that was shipped with `pathlib.Path`
```


In [15]:
p.mkdir()

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'C:\\Users\\aalsaf01\\tmp_results\\tmp_folders\\subsubfolder\\file.txt'

Do you like that error above? There you go. That's the good old `mkdir`.

In a similar fashion you will find the following:

* `delete` gives you an easy time compared to the existing `unlink`, `rmdir` and other methods.
  This will simply **delete**, no matter what. Is it a folder? Is it a file? If it is a folder, is it empty?
    Don't worry about anything. Just delete.
    
    * There is also `send2trash` from the famous `send2trash` module, which sends files to the recycle bin.
        
* `create` is an easy way to do `mkdir`: it doesn't complain about existence and always creates the children and parents required.

* `search` is a powerful and easy form of `glob`, which still exists.
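To give a feel for what the three methods in the list save you from, here is a rough standard-library equivalent of each. These are sketches of the described behaviour, not crocodile's code: the function bodies are assumptions, and the parameter names `pattern`, `r`, `files` and `folders` are borrowed from the signature of `tb.P.search` with guessed semantics. The demo works inside a throwaway temporary folder.

```python
import shutil
import tempfile
from pathlib import Path


def create(path):
    """Make the folder and any missing parents; stay quiet if it already exists."""
    path.mkdir(parents=True, exist_ok=True)
    return path


def delete(path):
    """Remove a file, a symlink, or a whole folder tree; do nothing if it is absent."""
    if path.is_dir() and not path.is_symlink():
        shutil.rmtree(path)
    elif path.exists() or path.is_symlink():
        path.unlink()


def search(folder, pattern="*", r=False, files=True, folders=True):
    """Glob inside `folder` (recursively when r=True) and keep files and/or folders."""
    hits = folder.rglob(pattern) if r else folder.glob(pattern)
    return sorted(x for x in hits if (files and x.is_file()) or (folders and x.is_dir()))


root = Path(tempfile.mkdtemp())
nested = create(root / "a" / "b")
create(nested)  # a second call does not raise
(nested / "note.txt").write_text("hello", encoding="utf-8")
found = search(root, "*.txt", r=True)
only_folders = search(root, "*", files=False)
delete(root / "a")  # non-empty folder, removed in one call
```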


If you're used to `glob`, it is available as a method, otherwise you can see the `search` method. It takes a minute to learn what it does, so we content ourselves here by just inspecting its docstring

In [16]:
? tb.P.search

Signature:
tb.P.search(
    self,
    pattern: str = '*',
    r: bool = False,
    files: bool = True,
    folders: bool = True,
    compressed: bool = False,
    dotfiles: bool = False,
    filters: Optional[list[Callable[[Any], bool]]] = None,
    not_in[

In [17]:
## Overloading operations 
# plus symbol: acts like plus for strings, i.e. concatenation.
print(p + "_new")


C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt_new


In [18]:
# Forward slash OPERATOR: joins paths
r = p / "haha"
# Note: this is not the same as the naive "/" concatenation which is platform-dependent
r = p + "/" + "haha"
# it is actually the same as 
p.joinpath("haha")

👻NotExist 'file:\C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt\haha'
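If you are curious how such operators come about, the sketch below adds a string-style `+` to a `pathlib` subclass, while `/` is inherited from `pathlib` itself. `MiniPath` is a hypothetical class written for illustration; how `P` actually implements `__add__` is not shown in this tutorial.

```python
from pathlib import PureWindowsPath


class MiniPath(PureWindowsPath):
    """Hypothetical subclass that adds string-style `+` on top of pathlib's `/`."""

    def __add__(self, other):
        # Concatenate text onto the end of the path, like str + str.
        return type(self)(str(self) + str(other))


p = MiniPath(r"C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt")
suffixed = p + "_new"  # text is glued to the last component
joined = p / "haha"    # a new component is appended
```

Note that `/` and `joinpath` return an instance of the subclass, so the extra operator remains available on the result.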

In [19]:
# Some nifty methods for creating versions of the same file
print(p.prepend("this_is_a_prefix_"))
print(p.append("_this_come_after_name_but_before_suffix"))
# Incredibly useful when creating a variant of an existing file.

C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\this_is_a_prefix_file.txt
C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file_this_come_after_name_but_before_suffix.txt
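For reference, the same two results can be produced with plain `pathlib` by combining `with_name`, `stem` and `suffix`. The two functions below reuse the names `prepend` and `append` only for readability; they are a sketch of the behaviour seen in the output above, not the library's implementation.

```python
from pathlib import PureWindowsPath


def prepend(path, prefix):
    """Put `prefix` in front of the file name."""
    return path.with_name(prefix + path.name)


def append(path, text):
    """Insert `text` after the stem but before the suffix."""
    return path.with_name(path.stem + text + path.suffix)


p = PureWindowsPath(r"C:\Users\aalsaf01\tmp_results\tmp_folders\subsubfolder\file.txt")
with_prefix = prepend(p, "this_is_a_prefix_")
with_infix = append(p, "_this_come_after_name_but_before_suffix")
```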


### Conclusion 

 Path Class: Designed with one goal in mind: any operation on paths MUST NOT take more than one line of code.
    It offers:
    * methods that act on the underlying object on the disk drive: move, move_up, copy, encrypt, zip and delete.
    * methods that act on the path object: parent, joinpath, switch, prepend, append.
    * attributes of the path: stem, trunk, size, date etc.
    

Have a looksee at the full list of methods.
Basically, you have a comprehensive set of methods to do any of:

* Path manipulation.
* Searching directories.
* Getting files and folders specs, e.g. size and time etc.
* File management capabilities, e.g. delete, copy, compression of files and folders with one line, to anywhere.

Spend the next 30 seconds inspecting the names of the methods, and they will spring to your mind whenever you need them later. Always remember that there is a method to do what you want in one line. If not, and you think it is worth a method, please suggest it on GitHub.

#### Note 1: if there are any modules out there that do not understand this Path object, then you can easily convert back to a string with ``str(p)`` or ``p.string`` when needed, on the fly, as you pass the parameter.

#### Note 2: The forward slash "/" works nicely on all platforms, so use it if writing **path string** manually. 


In [20]:
dir(tb.P)

['__add__',
 '__bytes__',
 '__call__',
 '__class__',
 '__contains__',
 '__deepcopy__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__fspath__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__radd__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rtruediv__',
 '__setattr__',
 '__setitem__',
 '__setstate__',
 '__sizeof__',
 '__slots__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__weakref__',
 '_cached_cparts',
 '_cparts',
 '_drv',
 '_flavour',
 '_format_parsed_parts',
 '_from_parsed_parts',
 '_from_parts',
 '_hash',
 '_make_child',
 '_make_child_relpath',
 '_parse_args',
 '_parts',
 '_pparts',
 '_resolve_path',
 '_return',
 '_root',
 '_scandir',
 '_str',
 '_type',
 'absolute',
 'anchor',
 'append',
 'append_text',
 'as_posix',
 'as_str',

Last but not least, there is `tb.P.tmp` and its alias `tb.tmp`, an incredibly useful thingy to store temporary files conveniently outside your current coding directory. It creates a folder called `tmp_results` in your home directory, so you can put your results and files there temporarily.

In [21]:
print(tb.tmp())

C:\Users\aalsaf01\tmp_results


In [22]:
string = "This is a string to be saved in a text file"

(tb.tmp() / "txtfile.txt").write_text(string)

📄 'file:\C:\Users\aalsaf01\tmp_results\txtfile.txt' | 2023-10-15  10:18:58 | 0.0 Mb
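The idea behind `tb.tmp` can be sketched in a few lines of standard-library code. The helper below is hypothetical: its `home` argument is an addition that lets the demo run inside a sandbox folder instead of touching your real home directory.

```python
import tempfile
from pathlib import Path


def tmp(home=None, folder="tmp_results"):
    """Hypothetical stand-in: make sure `<home>/tmp_results` exists and return it."""
    base = Path(home) if home is not None else Path.home()
    path = base / folder
    path.mkdir(parents=True, exist_ok=True)
    return path


string = "This is a string to be saved in a text file"
sandbox = Path(tempfile.mkdtemp())  # demo only: keeps the real home directory clean
target = tmp(home=sandbox) / "txtfile.txt"
written = target.write_text(string, encoding="utf-8")
```

One difference worth knowing: `pathlib`'s own `write_text` returns the number of characters written, whereas the cell above displays a path as its result.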

Now let's inspect this directory with our computer file explorer.

Did you see the file?

In [23]:
tb.tmp()()  # .explore is a method of `P`, it opens the files and directories using system defaults.
# e.g. if you have an image path, you can open it with your default system image viewer with this method.

📁 'file:\C:\Users\aalsaf01\tmp_results' | 2023-02-21  11:17:40

# The Struct Module

This offers a very convenient way to keep bits and sundry items in a little container with easy-to-use syntax. More specifically, it extends dict such that it enables accessing items using both dot notation and keys.

Let's give it a try:

In [36]:
x = tb.Struct(a=2, b=3)  

x.a == x['a']

True
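A minimal sketch of the idea, assuming nothing about how `Struct` is really built: a `dict` subclass that routes attribute access to its keys. The class name `DotDict` is made up for this example.

```python
class DotDict(dict):
    """Hypothetical container: items are reachable as d["key"] and as d.key."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


x = DotDict(a=2, b=3)
x.c = 5  # stored as a regular key
same = x.a == x["a"]
```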

## Save Methods

Almost certainly, you want to save your hyperparameters to reuse them somewhere else or later, so let's do that

In [37]:
x.save_json(tb.tmp() / "my_config.json")

Struct: [a, b, ]

Guess what? there's more:

* ``save_json``  Excellent for Config files as it provides human readable format.
* ``save_npy``  Excellent for numerical data.
* ``save_pickle``  Generic.
* ``save_mat``  For passing data to Matlab animals.

Guess what?

These methods are available for all classes: `List`, `Struct` and even `P`. What's more? Well, you can equip all of your own **existing** classes with these capabilities by simply adding one word to their inheritance: ``tb.Base``.

Later, to load up anything, you run the class method `from_saved` from any class and pass the path.
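The pattern of gaining save and load abilities through inheritance can be illustrated with a small mixin. This is only a sketch of the idea for the JSON case: `JsonSavable` and `Config` are hypothetical names, and `tb.Base` may well work differently inside.

```python
import json
import tempfile
from pathlib import Path


class JsonSavable:
    """Hypothetical mixin: adds save_json / from_saved to any plain class."""

    def save_json(self, path):
        Path(path).write_text(json.dumps(self.__dict__), encoding="utf-8")
        return Path(path)

    @classmethod
    def from_saved(cls, path):
        obj = cls.__new__(cls)  # skip __init__, restore the attributes directly
        obj.__dict__.update(json.loads(Path(path).read_text(encoding="utf-8")))
        return obj


class Config(JsonSavable):  # an existing class gains the methods via one base class
    def __init__(self, a, b):
        self.a = a
        self.b = b


target = Path(tempfile.mkdtemp()) / "my_config.json"
Config(a=2, b=3).save_json(target)
restored = Config.from_saved(target)
```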

Do you miss the dict? Or do you have a class that only understands dict objects?
You can convert back to a dict on the fly with `x.dict` or `x.__dict__`.

In [38]:
x.dict

{'a': 2, 'b': 3}

In [39]:
x.print()

Structure, with following entries:
Key                    Item Type                    Item Details
---                    ---------                    ------------
a                      int                          2
b                      int                          3



# The List Module

``tb.List`` or its alias `tb.L` offers a class with a single attribute named `list`, which is a Python `list`. The class gives an enhanced version of the `forEach` and `map` methods of JavaScript arrays.

*Use this class whenever you have objects of the same type and you want to containerize them.*

In [40]:
a = tb.L([1, 2., 3.2])

In [42]:
a.print()

 0- 1 
 1- 2.0 
 2- 3.2 


##### let's do some implicit for loops

In [44]:
a.apply(int).print()  # this is like .map() in JS arrays, but it is even better.

 0- 1 
 1- 2 
 2- 3 


### Let's do some heavy lifting...
Did you know that search results returned by a `P` object are more `P` objects containerized in a `List` object?


This is too much power! `L` and `P` are working together!!

In [45]:
results = tb.P.home().search("*")
print(type(results))  # it is a `List` Object!

<class 'crocodile.toolbox.List'>


In [48]:
results

List object with 37 elements. One example of those elements: 
P: C:\Users\Alex\3D Objects

In [None]:
results.print()  # nice print function if you did not like the __repr__ method which is very succinct.

In [49]:
results.time()

List object with 37 elements. One example of those elements: 
datetime.datetime(2020, 12, 9, 12, 12, 16, 990216)

Wait: How did that happen?

`time` is a method of `P` that tells what time the file was created.

However, the method was run against a `List` object, which internally called it in a for loop over the individual items.

In other words, the above is short for `results.apply(lambda x: x.time())`
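One plausible way to get this kind of forwarding in Python is `__getattr__`, which is consulted only when an attribute is not found on the container itself. The sketch below is an assumption about the mechanism, written with a hypothetical `MiniList` class; it is not taken from crocodile's source.

```python
class MiniList:
    """Hypothetical container that forwards unknown method calls to its items."""

    def __init__(self, items):
        self.list = list(items)

    def apply(self, func):
        return MiniList(func(x) for x in self.list)

    def __getattr__(self, name):
        # Reached only for attributes MiniList itself does not define.
        if name.startswith("_") or name == "list":
            raise AttributeError(name)

        def forward(*args, **kwargs):
            # The implicit for loop: call the method on every item.
            return MiniList(getattr(x, name)(*args, **kwargs) for x in self.list)

        return forward

    def __repr__(self):
        return f"MiniList({self.list!r})"


words = MiniList(["a b", "c d e"])
upper = words.upper()                   # same as words.apply(lambda x: x.upper())
counts = words.split(" ").apply(len)    # forwarded call, then an explicit apply
```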


## Object Modifications

You can manipulate the objects containerized in `L`. The result returned is another `L` which encapsulates the outcome of modification or whatever function applied to the objects, like so:

In [50]:
results.apply(lambda x: x[1:3])

List object with 37 elements. One example of those elements: 
P: Users\Alex

### TEST!

Do you reckon that you're now on top of the library?

Yes?

Put that assertion to test!

To calculate how many lines of code are in `crocodile` so far, we run this

In [37]:
tb.P(tb.__file__).parent.search("*.py").read_text(encoding="utf-8").split("\n").apply(len).to_numpy().sum()

4530

Is it obvious to you what happened there?

**Can you write more one-liners like that?**
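If it was not obvious, here is the same chain unpacked step by step with the standard library only. The helper `count_lines` is hypothetical and the demo runs on a throwaway folder rather than on the `crocodile` package itself.

```python
import tempfile
from pathlib import Path


def count_lines(folder):
    """Sum the number of newline-separated pieces over all *.py files in `folder`."""
    files = Path(folder).glob("*.py")                       # .search("*.py")
    texts = (f.read_text(encoding="utf-8") for f in files)  # .read_text(...) on each file
    lengths = (len(t.split("\n")) for t in texts)           # .split("\n").apply(len)
    return sum(lengths)                                     # .to_numpy().sum()


demo = Path(tempfile.mkdtemp())
(demo / "one.py").write_text("a = 1\nb = 2\n", encoding="utf-8")
(demo / "two.py").write_text("print('hi')", encoding="utf-8")
(demo / "notes.txt").write_text("ignored\n", encoding="utf-8")
total = count_lines(demo)
```

Because the text is split on `"\n"`, a file that ends with a newline contributes one extra, empty piece, so the figure is a slight overcount of the real number of lines.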

# Log class

This class is needed once a project grows beyond simple work. Plain print statements from dozens of objects stop being useful, because the programmer cannot easily tell which object printed a given message, in addition to many other concerns.

Advantages of using instances of this class: you do not need to handle the pickling process yourself by modifying the `__getstate__` method of the class that owns the logger. Otherwise you would have to, because loggers lose access to their file handler when unpickled, so it is better to instantiate them again. A logger can be pickled, but its handlers are lost, so the reconstruction is never perfect. Additionally, this class keeps track of the log files used and appends to them if they still exist.

Implementation detail: the design favours composition over inheritance. To offset the extra typing needed to reach the logger, a `logger` property was added to the `Base` class to refer to it.
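The re-instantiation idea described above can be sketched as follows. `MiniLog` is a hypothetical wrapper, not the library's `Log` class: it keeps only its specifications in its state and rebuilds the logger afterwards. The demo uses `copy.copy`, which goes through the same `__getstate__` / `__setstate__` hooks that `pickle` uses.

```python
import copy
import logging
import tempfile
from pathlib import Path


class MiniLog:
    """Hypothetical wrapper: its state holds only the specs; the logger is rebuilt."""

    def __init__(self, name, file):
        self.specs = {"name": name, "file": str(file)}
        self._build()

    def _build(self):
        self.logger = logging.getLogger(self.specs["name"])
        self.logger.setLevel(logging.INFO)
        if not self.logger.handlers:  # reuse handlers if the logger is already set up
            self.logger.addHandler(logging.FileHandler(self.specs["file"], mode="a"))

    def __getstate__(self):
        return self.specs  # the logger and its handlers are left out on purpose

    def __setstate__(self, state):
        self.specs = state
        self._build()  # reconnect and keep appending to the same file


log_file = Path(tempfile.mkdtemp()) / "demo.log"
log = MiniLog("mini_log_demo", log_file)
log.logger.info("first")
clone = copy.copy(log)  # state is extracted and restored through the two hooks
clone.logger.info("second")
lines = log_file.read_text().splitlines()
```

The class that owns a `MiniLog` instance needs no special pickling code of its own, which is the convenience the paragraph above refers to.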