-
-
Notifications
You must be signed in to change notification settings - Fork 301
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
be62050
commit 355c673
Showing
7 changed files
with
301 additions
and
11 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
Watch this video: <a href="https://www.youtube.com/watch?v=WDIkqP4JbkE">Scott Meyers: Cpu Caches and Why You Care</a>. Enough said | ||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
Copying data obviously causes space and time being wasted. But it also could have other side effects. For example, the construction and/or destruction of objects being copied may have side effects. Also, some object may be consumers of expensive resources and copying them may increase resource consumption. | ||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
<I>Garbage In, Garbage Out</I> is when a user inputs a dataset that is not suitable for a particular analysis, and (s)he gets nonsense as a result. For example, a data analysis algorithm may require that the input to be normally distributed. If user inputs a dataset that is far from being normal distribution, the result will be garbage.<BR> | ||
One approach would be for the library to check the data first and warn the user that the data is not suitable. DataFrame doesn’t do that because this approach has many pitfalls: | ||
<OL> | ||
<LI>It makes the code inefficient and slow</LI> | ||
<LI>It makes the code convoluted and hard to understand and maintain</LI> | ||
<LI>The check often is more complicated than the algorithm itself and it makes the code bug-prone</LI> | ||
</OL> | ||
|
||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
Multithreading is used for two main reasons; one is to increase throughput and performance and other is to make the code more succinct and structured.<BR> | ||
Using multithreading to increase throughput and performance is very tricky and often it is counterproductive. Unfortunately, it is very hard to have a <I>generic</I> multithreading solution for all problems. It is heavily dependent on the nature of the problem and the hardware/software platform. That is why the multithreading solution in DataFrame is very tunable and requires careful user adjustments. I suggest to always start with a single thread and later when the system is working correctly experiment with multithreading.<BR> | ||
Also, watch this video: <a href="https://www.youtube.com/watch?v=WDIkqP4JbkE">Scott Meyers: Cpu Caches and Why You Care</a> | ||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
Containers of pointers to data are very inefficient because of two main reasons:<BR> | ||
<OL> | ||
<LI>Each pointer points to a "different" memory location. In other words, the data is not stored contiguously. That breaks cache locality and therefore it is a major inefficiency.</LI> | ||
<LI>To access data through a pointer, in general, you must do two things; first you must dereference the pointer and then access the data it is pointing to. For large datasets it becomes an inefficiency.</LI> | ||
</OL> | ||
|
||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
<!-- | ||
Copyright (c) 2019-2026, Hossein Moein | ||
All rights reserved. | ||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
* Redistributions of source code must retain the above copyright | ||
notice, this list of conditions and the following disclaimer. | ||
* Redistributions in binary form must reproduce the above copyright | ||
notice, this list of conditions and the following disclaimer in the | ||
documentation and/or other materials provided with the distribution. | ||
* Neither the name of Hossein Moein and/or the DataFrame nor the | ||
names of its contributors may be used to endorse or promote products | ||
derived from this software without specific prior written permission. | ||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND | ||
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED | ||
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL Hossein Moein BE LIABLE FOR ANY | ||
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES | ||
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | ||
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND | ||
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS | ||
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
--> | ||
<!DOCTYPE html> | ||
<html> | ||
<body> | ||
|
||
For small to small-medium size datasets std::variant (aka type-safe union) is a neat trick. For that kind of datasets combining std::variant with std::visit can be a good substitute for runtime polymorphism. But for large datasets std::variant has two main problems:<BR> | ||
<OL> | ||
<LI>There is extra code that compilers/users must insert to figure out the type of each item inside std::variant in conjunction with std::visit and what to do with it</LI> | ||
<LI>Size of std::variant is the size of its largest member. So, you could be wasting significant memory space</LI> | ||
</OL> | ||
<BR> | ||
<BR> | ||
<BR> | ||
|
||
<img src="https://github.com/hosseinmoein/DataFrame/blob/master/docs/LionLookingUp.jpg?raw=true" alt="C++ DataFrame" | ||
width="200" height="150" style="float:right"/> | ||
|
||
</body> | ||
</html> | ||
|
||
<!-- | ||
Local Variables: | ||
mode:HTML | ||
tab-width:4 | ||
c-basic-offset:4 | ||
End: | ||
--> |